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Abstract 

A  Monte  Carlo  method  to  solve  radiation  transport  problems  by  using 
the  ILLIAC  IV  is  discussed.   An  emphasis  is  put  on  the  special  features 
to  be  encountered  in  parallel  computation:   data  structure,  PE  efficiency, 
etc.   A  test  program  implemented  in  GLYPNIR — an  ILLIAC  IV  language — is 
shown  and  some  preliminary  results  on  its  statistical  properties  are  also 
discussed. 
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1 .   INTRODUCTION 

The  Monte  Carlo  Method  started  its  history  in  1777  when 
Conte  de  Buffon  (1707-1788)  presented  a  famous  problem  to  calculate  tt  by 
tossing  needles  (Buffon's  problem).   (Ref.  [1]  ).   Real  progress  was 
not  made  until  digital  computers  became  available.   In  1947 
J.  von  Neumann,  R.  Richtmyer  and  S.  Ulam  used  the  ENIAC  to  solve  neutron 
diffusion  problems  by  simulating  the  movement  of  particles.  (Ref.  [12]). 
Since  then,  development  has  been  remarkable  in  fields  such  as 
partial  differential  equations,  definite  integrals,  serving  systems, 
linear  equations  and  transport  problems.  (Ref.  [10]  ).   In  transport 
problems  a  necessity  to  calculate  neutron  flux  within  reactors  or  gamma- 
ray  flux  through  shielding  walls  has  encouraged  the  development  of  more 
efficient  algorithms. 

Due  to  their  statistical  nature,  simulation  of  random  processes 
on  a  digital  computer  requires  a  large  memory  and  fast  logical  and  arith- 
metic operations  to  obtain  necessary  accuracy.   The  aarallel  structure  of 
the  ILLIAC  IV  computer,  although  originally  designed  for  matrix  problems 
or  partial  differential  equations,  is  applicable  to  random  simulation. 
This  paper  is  concerned  with  the  application  of  the  ILLIAC  IV  to  radiation 
transport  problems. 

In  this  chapter,  a  physical  model  and  the  basic  equations  are 
presented.   Certain  other  assumptions  are  also  discussed.   Special  fea- 
tures to  be  encountered  in  parallel  computation  will  be  discussed  in  the 
subsequent  chapters. 


1. 1   Basic  Equations 

The  physical  phenomenon  described  in  this  paper  is  "non- 
linear radiation  transport"  in  the  low  energy  region,  which  is  en- 
countered, for  example,  in  the  study  of  stellar  atmosphere.   (Ref.  [2], 
[4],  [7]). 

Consider  a  block  of  material  (or  gas,  to  be  referred  to  as 
"medium"  hereafter)  which  is  not  necessarily  homogeneous.   Let  a  beam  of 
photons  be  incident  on  it  (or  let  a  heat  source  be  attached  to  it).   Some 
of  the  photons  will  pass  through  the  medium  undisturbed,  some  will  be 
absorbed  by  the  medium  and  release  their  energy  to  it,  while  others  will 
be  scattered  into  new  directions  (with  new  energies).   Since  the  absorption 
of  photons  affects  the  local  temperature  and  the  change  of  the  temperature 
itself  affects  the  emission  of  photons,  the  whole  process  is  rather  com- 
plicated.  Evolution  of  the  photon  intensity  and  the  temperature  is  given 
in  the  form  of  a  system  of  nonlinear  integro-dif ferential  equations  as 
will  be  shown  below. 

Let  dE  denote  the  radiation  energy  in  a  frequency  range 
(v,v,  +  dv)  which  is  transported  across  an  element  of  area  dS  at  x  and 
in  directions  confined  to  a  solid  angle  dfl  during  a  time  dt.  (Fig.  1.1). 
Then  the  intensity  I  (x,u,v)  is  defined  as  follows: 

dEv=  I  (x,u,v)  cos  0  dvdSdftdt 
The  basic  equation  which  I  satisfies  is: 

—    I  (x,u,v)  =  -  atot  I  (x,y,v)  +  B(x,y,v) 
ds 

+  ascatt  •  //  P  (v+y)    I  (x,y,v)dy'dft 

(1,1) 
The  left  hand  side  of  (1,1)  is  the  total  change  of  the  intensity  along  the 
path  of  radiation;  its  more  explicit  form  is  dependent  on  the  geometry 


which  one  is  to  consider  (Cartesian,  spherical,  etc.).   Table  1.1  shows 
some  examples. 

The  first  term  on  the  right  hand  side  represents  diminution  of 
the  radiation  caused  by  various  kinds  of  interaction:   absorption,  scat- 
tering, or  pair-creation  (only  in  high  energy  region).   The  probability  of 
the  occurrence  of  each  interaction  is  expressed  in  terms  of  cross  section, 
atot  is  the  total  cross  section  and  of  course  atot  =  aabs  +  ascatt  +  .  .  .  . 
The  second  term  is  the  source  term  which  includes  the  black  body  radiation 
(note  that  the  absorption  of  photons  causes  re-emission  of  new  photons 
through  this  process),  pair-annihilation  (only  in  high  energy  region)  and 
some  other  heat  sources.   In  astrophysics,  the  assumption  of  local  thermo- 
dynamical  equilibrium  is  often  adopted.   That  is,, if  the  temperature  in  the 
medium  varies  very  slowly  with  relation  to  position  then  the  laws  of  thermo- 
dynamic equilibrium  are  still  applicable  at  each  point  with  the  local  tem- 
perature.  Therefore,  photons  are  re-emitted  according  to  "Kirchhoff's  law:" 

B(x,y,v)  =  aabs  •  2hy   1 

c^   exp  /-hv-j 

where  aabs  and  T  depend  on  x. 

Note  that  the  temperature  T  is  involved  in  a  non-linear  way. 

The  third  term  is  the  scattering  term  and  represents  a  contri- 
bution from  other  directions  u"  into  y.   P  denotes  the  scattering  probabil- 


ity from  y'  to  y.   If  Thomson  scattering  is  assumed,  then 

3_ 
16 


2 
P(cos  0)  =  3    (1  +  cos  0) 


and 

2 

li"  =  ]s    '    cos  0  -/  1-u*   sin  0  cos  (ip'-ty) 

(See  Fig.  1.2) 


The  temperature  T,  on  the  other  hand,  changes  with  time  according  to  the 
following  equation: 


pCviT=  c^  Jo  aabs  v 


(rl) 


exp 


hv 


kT 


-  1 


dv 


where 


P 
Cv 

aabs 


+  /   dv   J  aabs 
-1      0 


density 
specific  heat 


I  (x,u  ,v)   dv 


(1,2) 


absorption  cross  section 

(dependent  on  x  and  T  also  depends  on  x) 


The  left  hand  side  of  (1,2)  is  the  change  of  stored  energy  in  the  medium. 
The  first  term  on  the  right  hand  side  is  the  loss  of  energy  due  to  the 
black  body  radiation.   The  second  term  is  the  gain  of  energy  due  to  the 
absorbed  photons.   Hence,  (1,1)  and  (1,2)  describe  the  energy  balance  of 
the  total  system.  (Fig.  1.3).    If  I  is  averaged  over  frequency,  then  those 
equations  become  simpler  .   That  is, 


ds 


I(x,y,t)   =  -    (aabs  +  ascatt)    I(x,u,t)   +  ascatt   J    I(x,U    ,t) 


-1 


and 


P(y,u')dy'  +  aabs  qstephan  T(x,t)   +  S(x,y) 

IT 

a,ir 
,  i 

pCv     aT(x,t)      =  -  aabs     qstephan   T(x,t)      +  aabs  J      I(x,y "  ,t)dy 
3t  tt  -1 

(1,2) 


1.2   The  Model  for  Monte  Carlo  Calculation  (general  discussion) 

This  section  describes  the  model  implemented  on  ILLIAC  IV. 

(1)   The  geometry  is  two  dimensional.   (Density  is  assumed  to  be 

uniform  with  regard  to  the  third  coordinate  although  particles 


travel  in  three  dimensional  space). 

(2)  The  material  is  expressed  as  64  x  8  cells  (it  is  easy  to 
extend  up  to  64  x  64);  each  cell  can  have  variable  density  and 
specific  heat. 

(3)  Absorption  is  due  to  the  photo-electric  effect,  and  its  cross 
section  can  be  factored  as  a  density  dependent  term  and  another 
term  which  is  dependent  on  photon  energy.   The  same  thing  is 
assumed  as  for  scattering  cross  section. 

aabs  =  p(x,y)  •  k(E) 

oscatt  =  p(x,y)  *  A(E) 
p(x,y),  k(E)  and  X(E)  can  be  stored  as  tables  and  linear  inter- 
polation is  used  to  generate  cross  sections. 

(4)  The  number  of  particles  which  are  to  be  created  is  dependent 
on  the  local  temperature.   There  are  two  alternatives  as  to  their 
energies:   Planckian  distribution  or  mono-energetic  (the  latter 
is  adopted  for  the  test  program) . 

Photons  are  modeled  as  a  particle  table.   The  term  "particles" 
should  be  distinguished  from  "photons"  hereafter.   The  former  means 
pseudo-photons  within  a  computer  whereas  the  latter  means  actual  photons. 
The  table  size  is  limited  by  the  size  of  memory  available.   Refer  to 
Chapter  3  for  the  actual  table  size.   The  medium  has  been  divided  into 
small  cells  of  Ax  •  Ax.   The  time  unit  is  At. 
1. 3  Creation  of  Particles 

At  the  beginning  of  a  time  step,  the  number  of  particles  to  be 
created  in  cells  is  determined  according  to  the  temperature.   As  the  table 
size  is  limited,  more  particles  are  assigned  to  those  cells  which  are  hot. 
Therefore,  only  the  cell  numbers  are  assigned  to  the  new  particles  at  this 


stage. 

The  creation  of  new  particles  follows.   The  quantities  given 
below  are  assigned  to  each  of  the  new  particles: 

(1)  relative  coordinates  x  and  y  .  .  .  . (uniformly  distributed) 

(2)  direction  cosine  u (uniformly  distributed) 

(3)  sine  and  cosine  of  azimuthal  angle  . (uniformly  distributed) 

(4)  survival  distance  (path  length).  .  .(exponentially  distrib- 

uted) 

(5)  fraction  g  to  indicate  at  what  time   £;At  this  particle 

was  created (uniformly  distributed) 

1.4   Processing  a  Time  Step 

After  all  the  new  particles  are  created,  the  table  scanning 
starts.   The  position  of  a  particle  is  incremented  as  follows: 

x'  =  x  +  vu_At 

/ — 2 
y'  =  y  +  v  /1-u  sin  $   -At 

The  distance  which  a  particle  is  supposed  to  travel  (1  =  vAt)  is  compared 
with  -r/otot  where  otot  =  aabs  +  ascatt,  and  if  the  former  is  larger  than 
the  latter  an  interaction  is  assumed  to  occur.   The  type  of  interaction 
is  determined  by  generating  a  uniformly  distributed  random  number  and 
comparing  it  with  aabs/atot.   If  the  random  number  is  less  than  this  quan- 
tity, absorption  is  taken  and  the  energy  of  the  particle  is  added  to 
the  cell  in  which  the  interaction  occurred.   This  particle  is  to  be  de- 
stroyed. 

On  the  other  hand,  if  the  random  number  is  greater  than  this 
quantity,  then  the  interaction  is  scattering  and  new  angles,  new  survival 
distances  are  selected  and  assigned  to  the  particle. 


1.5   The  New  Temperature 

After  the  scanning  of  the  particle  table,  the  intensity  in  each 
cell  is  computed  by  summing  the  particles  in  it.   Then  the  new  temper- 
ature is  computed. 


The  last  three  sections  are  a  narrative  description  of  the  Monte 
Carlo  program.   A  flow  chart  of  the  computation  process  (not  the  program) 
is  shown  in  Figure  1.4.   In  the  subsequent  chapters  particular  problems 
for  parallel  processing  are  to  be  discussed. 
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Figure  1.1  Definition  of  Intensity 
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Figure   1.2     Definitions   of  Angles   in   Scattering  Process 
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Figure   1.4     A  Flow  Chart   of   a  Monte   Carlo   Program 
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2.   RANDOM  SAMPLING 

An  important  part  of  a  Monte  Carlo  method  is  the  generation  of 
good  random  numbers.   The  generation  of  uniformly  distributed  random 
numbers  is  discussed  in  Section  2.1.   Transformation  from  the  uniform 
distribution  to  an  arbitrary  distribution  is  discussed  with  some  examples 
which  are  encountered  in  transport  problems  (Sections  2.2  and  2.3).   In 
this  and  subsequent  chapters,  an  emphasis  is  put  on  the  application  of  the 
Monte  Carlo  method  to  the  ILLIAC  IV  computer,  and  readers  are  assumed  to  have 
a  fundamental  knowledge  of  it  (especially  its  architecture).   (Ref.  [11]). 
Terms  such  as  PE  (the  processing  element),  CU  (the  control  unit)  will  be  used 
2. 1   Linear  Congruential  Method 

Since  von  Neumann's  middle-square  method  several  methods  have 
been  investigated  to  generate  uniformly  distributed  random  numbers. 
(Ref.  [6]).   The  linear  congruential  method  is  generally  accepted  to  be  the 
best  for  digital  computers  for  the  following  reasons: 

(1)  it  is  easy  to  calculate  the  subsequent  random  number  from 
the  updated  one, 

(2)  only  a  very  small  space  in  the  memory  is  required  (unlike 
storing  the  entire  or  partial  table  of  random  numbers), 

(3)  repeated  reproduction  of  the  same  sequence  is  possible, 

(4)  the  period  of  random  sequence  is  long  (if  proper  coefficients 
are  chosen) . 

The  linear  congruential  method  can  be  described  by  the  following 
formula: 


11 


Xn+1  =  (AXn  +  C)  mod  T  (2,1) 

where  A  and  C  are  constants,  Xn  is  the  nth  random  number  and  T 
is  also  a  constant.   The  initial  value  Xo  is  given.   Xn  ranges  between  0 
and  T.  &n=Kn/T   ranges  in  (0,1).   The  method  is  called  multiplicative  if 

C  =  0.   Otherwise,  it  is  called  mixed  congruential.   T  is  usually  chosen 

48 
to  be  the  register  length  of  the  computer  (2   in  ILLIAC  IV) .   A  and  C 

must  be  such  that  Xn  yields  good  uniform  random  sequence.   Consult  reference 
[6]  or  [14]  for  detail,  since  the  principle  of  this  method  is  not  to  be 
presented  in  this  paper.   The  method  which  has  been  adopted  for  the  ILLIAC 
IV  is  the  multiplicative  congruential  method  (Ref.  [14])  but  it  is  also 
possible  to  use  the  mixed  congruential  method.   In  both  methods  the  coef- 
ficients should  be  chosen  to  give  the  longest  possible  period  of  random 
sequence  so  that  one  sequence  of  random  numbers  can  be  decomposed  into  64 
sufficiently  long  subsequences.   The  easiest  way  is  to  divide  the  entire 
sequence  into  64  subsequences  and  generate  each  subsequence  in  a  different 
PE.   The  problem  with  this  method  is  the  serial  correlations  of  random 

numbers  across  PEs  or  within  each  PE.   For  example,  if  each  processor  ele- 

20 
ment  starts  with  every  2   random  number  of  the  original  sequence  then  the 

20 
correlations  of  random  numbers  separated  by  distance  2   or  its  multiples 

must  be  closely  examined,  because  they  will  be  generated  simultaneously  in 
PEs  and  any  non-negligible  correlation  among  them  could  cause  trouble. 
Nothing  is  known  about  the  correlations  over  that  long  distance  for  the  co- 
efficients currently  recommended  for  conventional  machines.   Usually  serial 
correlations  of  distance  up  to  10  or  20  are  examined.   Therefore,  closer 
attention  must  be  paid  to  choosing  proper  coefficients. 

There  is  a  formula  about  the  bound  of  serial  correlation  factor 
of  any  distance  r  (Ref.  [5]): 
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A    ,  ...    6C     , 

p  <  __r  +  1_  (1  - r  +   6 

r~   T    A        T 
r 


C 
r 


)  (2,2) 


where   X      =  (A  Xs  +  C  )  mod  T 
s  +  r     r      r 

A  =  Ar  mod  T 
r 

C   =  C  •  Ar-1    .  m 
r        .  .,  mod  T 
A-l 

n-r  n     2 

I      \    '    Xk+r    "  S   \) 
p   =  lim  k=l   R    k+r      k=l  * 

r        1 

n-x»   n    o    n     7 

I     \     ~    (I     \) 
k=l   *     k=l  k 

A_.  ~  /f  in  (2,2)  yields  the  smallest  serial  correlation  factor  pr  in  mixed 
congruential  methods.   But  it  is  impossible  to  choose  A  so  that  A.  ,  A„ , .  .  . 
all  become  the  order  of  /T  .   Therefore,  an  actual  test  run  in  each  case  is 
necessary.   Refer  to  a  document  to  be  published  in  the  future  for  the  result 
of  the  example  given  in  Reference  [14].   (A=5   ,  C=0,  T>2   ). 

The  method  to  use — multiplicative  or  mixed — is  another  problem. 
Statistical  tests  have  shown  that  the  multiplicative  method  is  better  than 
the  mixed.   (Ref.  [9]).   The  advantage  of  the  mixed  method  is  that  all  the 
integers  between  0  and  T  are  covered  so  that  sequence  can  be  started  at  any 
number.   In  the  multiplicative  method,  on  the  other  hand,  the  initial  ran- 
dom number  must  be  chosen  very  carefully.   The  actual  algorithm  used  for  the 
computation  on  ILLIAC  IV  will  be  given  in  a  document  to  be  published. 
2.2   Random  Sampling  from  a  Given  Distribution  Function 

There  are  several  methods  to  sample  random  numbers  of  a  given 
distribution  of  uniform  random  numbers. 
2.2.1   Rejection  Method 

The  rejection  method  was  proposed  by  von  Neumann  and  was  used  in  his 
works  on  neutron  transport  simulation.   (Ref.  [13]).   Theoretically,  this  method 
can  be  applied  to  any  type  of  density  function.   It  is  not  efficient,  however, 
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for  the  functions  which  have  sharp  maxima.   In  those  cases  many  random 

numbers  are  wasted  and  as  a  consequence  the  number  of  trials  increases. 

If  a  density  function  has  a  domain  [a,b]  and  maximum  value  f    ,  the  rate 

max 

of  success  is  l/(f    x  (b  -  a)). (Fig.  2.1).   This  disadvantage  is  disas- 
max 

trous  if  the  method  is  applied  in  a  straightforward  way  to  parallel 

sampling,  since  one  cannot  proceed  to  the  next  step  until  all  the  PEs 

accept  random  numbers. 

Suppose,  for  example,  that  there  are  N  PEs,  each  of  which  samples 

a  random  number  in  parallel  using  the  rejection  method.   Let  p  denote  the 

probability  of  accepting  a  random  number  in  a  PE  in  each  individual  trial, 

i.e.  p  =  l/(f    x  (b  -  a)).   Then  the  probability  that  each  PE  has 
max 

accepted  at  least  one  random  number  after  r  trials  is 

P  (p,  r,  N)  =  {1  -  (1  -  p)r}N  (2,3) 

This  can  be  shown  very  easily  because  (1  -  p)   is  the  probability  that 
none  of  the  r  trials  are  successful  for  individual  PE. 

Take  Planckian  case,  for  example.   Then 

15        x3 

f  (x)  =  —  *   7-t t-  f    =  .23 

4    exp  (x)  -  1    max 

IT 

f(10)  =  .8  x  10"2  .*.  F(10n  (Take  b  =  10  and  a  =  0) 
then  p  =  1/  (23  x  10)  =  .435 
with  N  =  64  (ILLIAC  IV), 
P  (.435,  r,  64)  =  {1  -  (.565)r}64 
Several  curves  are  plotted  in  Figures  2.2  and  2.3.   Figure  2.2 
shows  a  rapid  fall-off  of  the  efficiency  with  the  decrease  of  p. 

As  models  become  more  complicated,  various  kinds  of  distribution 
functions  will  appear  to  determine  scattering  angles  for  high  energy  par- 
ticles (Klein-Nishina's  formula)  or  pair  creations  followed  by  reannihilation 
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etc.   Each  of  these  processes  contains  several  branches  in  itself;  overall 
efficiency  is  not  high.   Table  2.1  shows  some  examples  taken  from  Reference 
[15].   Efficient  implementation  of  these  processes  using  the  rejection  meth- 
od for  the  ILLIAC  IV  is  very  difficult. 

2.2.2  Method  of  Inverse  Function 

If  a  distribution  function  F  has  an  inverse  function  F   which 
is  not  complicated  to  evaluate,  then  a  random  number  x  can  be  sampled 
from  f=F'  according  to  the  following  formula: 

R  =  [I   f(x')dx' 

or  x  =  F_1  (R) 

where  f  is  a  density  function,  and  R  is  a  uniformly 

distributed  random  number 
The  advantage  of  this  method  is  that  all  the  PEs  can  produce  random  num- 
bers simultaneously.   But  this  advantage  will  be  cancelled  out  if  F   is 
very  complicated  and  takes  a  long  time  to  evaluate.   Exponential  distri- 
bution is  a  typical  example  for  use  of  this  method: 

f(x)  =   Xexp(-Xx),  x  =  1/X   log  (1-R) 

where  R  is  a  sampled  uniform  random  number  . 

2.2.3  Approximation  of  Distribution  Functions  by  some  Simpler  Ones 

If  the  inverse  of  distribution  functions  cannot  be  obtained 
explicitly,  or  are  not  simple  to  calculate,  they  are  approximated  by  some 
simpler  functions,  or  more  frequently  by  a  set  of  data  points.   The  latter 
is  essentially  equivalent  to  the  inverse  linear  interpolation  with  table 
lookup  technique. 

The  method  can  be  described  as  follows : 
The  region  (0,1)  on  the  ordinate  (i.e.,  accumulated  distribution  function) 
is   divided  into  N  equal  subintervals .   Then  a  uniform  random  number  R  is 
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sampled  and  the  interval  in  which  it  is  located  is  checked.   A  random 
number  X  with  required  distribution  f  is  calculated  by  using  ordinates  and 
abscissas  of  the  two  points  which  are  the  edges  of  the  selected  subinter- 

vals,  i.e. ,  if  Fk<R<Fk+1,  then  x  =  (Fk+1  -  R)  •  Xfc  +  (R  -  Ffc)  "  X^+±      = 

~~ p      -  F 
k  +  1 

Xk(k  +  1  -  N-R)  +  \+1(N-R  -  k)  (2,4) 

The  basic  idea  of  this  method  is  to  chop  the  graph  of  a  given 
probability  distribution  function  f  into  N  strips  of  equal  area  and  ap- 
proximate f  by  the  average  fk  in  each  interval  so  that  fk  •  (X,  ,  -  X,  )=1/N, 

(Fig.  2.4).   Obviously  the  accuracy  is  good  near  maximum  point  of  f  where 
the  width  of  the  strip  is  relatively  narrow.   On  the  other  hand,  it  is  not 
good  in  the  region  where  f  is  close  to  zero  (e.g.  edge  regions  of  Gaussian 
or  Planckian  type  distributions).   Figure  2.5  shows  an  example  of  Planckian 
distribution.   (Sampling  number  =  10,000,  N  =  32). 

Sometimes  a  density  function  contains  a  parameter  y  .   Consider 
the  following  example: 

F(x,u)  =  1/8  {(3-y2)x  +  (y2  -  l/3)x3}   0<x,  y<l  (2,5) 

which  is  used  to  determine  a  new  direction  cosine  x  from  an  old  one,  y5  in 
one  dimensional  radiative  transport  problems.   Interpolation  in  this  case 
involves  two  variables,  x  and  y.   Data  is  stored  as  a  reference  table;  they 
are  values  x(k,m)  (0<k,m<n) ,  which  satisfy  F(x(k,m),ym)  =  k/N  (0<k,m<_l).   A 
random  number  x  for  given  y  is  then  calculated  by: 

x(y)  -  (ym+1  -  y)  *(v=vm)   +  (y-ym)  x(y=ym+1) 

=  X(k,m)(k+l-N-R)(m+l-y.N)  +  X(k,m+1) (k+l-N'R) (u'N-m)  +X(k+l,m) 

(N-R-k)(m+l-yN)  +  X(k+l,m+l)  (N-R-k)  (y  -N-m)  where  y  <y<y    , 

m    m+x 

X(i,j)  =  F~   (i/N,y.)  and  k,m  are  chosen  so  that  (R,y)  in  X-y 
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plane  is  inside  a  trapezoid  formed  by  four  points  (X(k,m) ,m) , (X(k+l,m) ,m) , 

(X(K,m+l),m+l),  and  (X(K+1 ,m+l) ,m+l) . 

More  precisely,  linear  interpolation  is  first  applied  to  the  two 

planes  u  =  u  and  u  =  u    separately,  then  the  two  lines  obtained  as  a 
m         m+1 

result  are  used  to  get  the  ultimate  interpolation  line  which  is  parallel  to  the 
x  =  constant  plane  and  on  which  the  point  in  question  is  located.   Therefore, 
the  following  errors  must  be  taken  into  account: 


E  I  <  max 

mi  —  t  v 

(V  \+i> 


i2|        •  (xk+1  -  x) (x  -  ^  1% 

3x  |  y=Mm 


E  ,J< 
m+1 1  _  max 

'  (v  W 


& 


3x^ 


v=n 


'  (\+l  ~  x)(x  "  Xk)  /2 


m+1 


mi 


=  maX  (lEml'  IVlP' 


=  max 

(^m'  ^m+l} 


3y' 


(Um+1  -  V)(viij    /2, 


x  =  x 


E      <  E   +  E    <  (h2M  +  k2M'  )  /8 
tot  —   m     x  — 


(2,7) 


where 


h   =  (\+l  "  Xk) 

k  =  ymxi  "  y 

M  =  max 


'm+1 

2 


M'  =  max 


3  f 
3X7 

2, 


3|i' 


Example:   N  =  32 


F  ={{3  -y2)  •  x  +  (y2  -  l/3)x3  }•  3/8 

M  -  3/2   (0<x,y<l) 

H  =  /a/ 6  (0<x,y<l) 

,  '.  E    <  1/8  •  1/1024  '  (3/2  +  /l/e   ) 
tot  — 


2.2  •  10 


-5 
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This  method  is  good  when  memory  size  is  not  valuable  compared 
with  the  number  of  arithmetic  operations  or  when  F(x,y)  is  very  complicated 
to  evaluate. 
2.2.4   Root-finding  Method 

Sometimes  formulae  for  functional  iteration  (such  as  Newton's 
formula)  are  useful  to  generate  random  numbers. 

Suppose  the  probability  density  function  is  p(x)  and  its  accumu- 
lated distribution  function  is  F(x).   If  a  random  number  R  is  sampled  out 
of  a  uniform  population  in  [0,1],  then  F(x)  -  R  =  0  when  solved  for  x, 
yields  a  random  sample  from  p(x).   Several  iterations  of  Newton's  method 
are  enough  to  get  a  satisfactory  result  if  F(x)  is  not  too  ill-behaved. 

It  is  very  important  in  parallel  computation  that  all  PEs  get 
accurate  values  after  a  certain  number  of  iterations,  regardless  of  the 
distribution  of  Rs.   (Note  that  all  PEs  have  different  Rs) .   In  the  ILLIAC 
IV,  any  logical  decision  which  involves  the  CU  (control  unit)  takes  a  long 

time  and  is  desirable  to  avoid.   The  following  illustrates  this.   To  de- 

2 
termine  a  scattering  angle  in  Thomson  scattering,  p(x)  =  3/4  (1  +  x  ) 

3 
(0<x<l).   Hence  F(x)  =  3/4  (x  +  x  )  and  Newton's  formula  to  calculate  a 


next  approximation  is: 

x  3 

X  ,-  =  X  -  F(Xn)  =  4/3  •  R  +  ^- 
n+1    n     .      r         I 

UXn;         1  +  Xn^  (2,8) 

Xn  =  . 7  turned  out  to  be  the  best  as  the  initial  value;  three  iterations 

—8 
give  satisfactory  results  (<10   ),  whatever  R  may  be. 

In  general,  this  method  is  not  very  efficient  if  evaluation  of  the 

right  hand  side  of  (2,8)  requires  much  computation;  i.e.,  some  transcendental 

functions  such  as  exponential  or  sine  are  involved.   The  third  method  and 
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the  fourth  method  of  this  section  are  complimentary  to  each  other  in  a 

sense.   Note  that  both  the  root-finding  and  the  inverse  function  method 

share  an  advantage  over  the  interpolation  method.   They  require  less  memory 

space. 

2. 3   Some  Remarks  on  Random  Sampling 

2.3.1  Forcing  (sampling  exactly  n  samples  out  of  N) 

Forcing  is  one  of  the  techniques  to  reduce  the  variance  of  sampling. 
Suppose  10%  of  samples  out  of  a  population  which  contains  N  elements  are 
selected.   The  simplest  way  is  to  generate  random  numbers  R  and  compare- 
them  with  0.1.   If  R<0.1  at  the  Kth  trial  then  select  the  Kth  element. 
Thus  at  the  end  of  N  trials,  0.1N  samples  on  the  average  will  be  obtained. 
But  sometimes  exactly  0.1N  samples  might  be  needed. 

The  alternative  method  is  to  keep  changing  the  criterion  at  each 
step  so  that  exactly  0.1N  samples  can  be  obtained  at  the  end.   This  is 
shown  in  Reference  [6],  for  example.   The  probability  of  selecting  the  K  -  1st 

element  is  0. IN  -  n  if  n  samples  have  already  been  sampled  in  the  K  steps. 

N-k 

This  method  is  applicable  to  the  ILLIAC  IV  in  the  usual  way  within 
PEs.   It  is  difficult,  however,  to  apply  it  to  select  exactly  n  PEs  out  of 
64,  although  this  kind  of  problem  may  be  occasionally  encountered.   The 
reason  is  that  the  process  is  inherently  sequential  in  the  sense  that  the 
probability  to  pick  up  the  t  +  1st  element  is  dependent  on  the  number  of 
elements  selected  so  far.   Therefore,  an  efficient  parallel  selection  is 
difficult. 

In  the  case  in  which  the  sampling  criterion  n  is  large  it  is 
possible  to  increase  the  efficiency  by  dividing  64  PEs  into  several  sub- 
groups of  the  same  number  of  PEs  such  as  4  groups  of  16  PEs,  and  each  group 
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selects  n/4  PEs  randomly  within  itself.   But  the  sampling  is  not  free  from 
biasing  and  it  might  cause  trouble  in  some  cases. 
2.3.2   Shuffling  (or  random  permutation  of  data) 

Shuffling  is  also  a  task  which  is  not  easily  adopted  to  the  ILLIAC 
IV.   Data  can  be  shuffled  in  both  ways — within  PEs  or  across  PEs.   The 
former  case  causes  no  trouble  and  the  algorithm  to  be  stated  below  works 
in  parallel,  but  one  will  encounter  the  same  difficulty  as  with  forcing 
in  the  latter  case.   There  are  two  problems  involved  in  the  random  permu- 
tation:  (1)  how  to  generate  addresses  to  send  data  to  and  (2)  how  to  send 
data  actually.   The  first  problem  is  equivalent  to  generating  64  random 
integers  (0  %  63)  each  of  which  is  different  from  all  others.   If  each  PE 

generated  a  random  integer  independently  of  others,  the  probability  of 

64    -64 
success  is  641/64   ^  e    .   The  algorithm  suggested  by  Knuth  (Ref.  [6]) 

is  applicable,  but  not  efficient.   It  states: 

Suppose  the  data  are  A(l)  through  A(N).   In  the  first  step  of  the 
problem  given  above,  A(i)  is  in  the  i  -  1th  PE  and  exchange  of  data  is 
performed  by  means  of  the  routing  registers.   Now,  there  is  a  register  R 
which  contains  an  integer.   Initial  value  of  R  is  N.   Then  a  random  integer 
S  is  generated  which  ranges  from  1  to  the  content  of  R.   Interchange  A(S) 
with  A(R) .   Decrease  the  content  of  R  by  1.   Generate  S,  interchange,  .  .  . 
and  so  forth.   This  process  is  continued  until  R  becomes  1. 

Obviously,  the  method  is  sequential  and  64  pairs  of  routing  (for- 
ward and  backward)  are  necessary. 

The  second  step  of  the  given  problem  (namely,  sending  data  to  a 
specified  address)  is  a  special  case  of  the  table  lookup  method,  in  which 
the  one-to-one  correspondence  between  the  sources  and  destinations  is 
guaranteed.   The  only  conceivable  method  is  to  use  the  routing  register  as 
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"conveyer"  to  put  all  the  data  on  and  to  drop  the  appropriate  data  to 
their  destinations  after  each  routing  of  distance  1.   This  takes  64  rou- 
tings in  the  worst  case. 


Table  2.1 


Type  of  Processes 


(1) 


p  =  n 


-1(2) 


(3) 


exponential  distribution       4.3 

sin<{>  and   cos<j> 

(   $:    uniform) 

direction  cosine  (Covoyou)     5.42 
direction  cosine  (von  Neumann)  5.15 

Compton  scattering 

Compton  scattering  (improved) 


f  -.785 
4 


pair  creation 


not  shown 


•  34~.65 
.75~.93 

not  shown 


Note: 

(1)  n 

(2)  p 

(3)  b 


the  number  of  generated  random  numbers  on  the  average 

efficiency  (reciprocal  of  n) 

the  number  of  branches  in  a  program  which  describes  each  process 
The  more  the  number  of  branches  is,  the  less  the  PE  efficiency 
is. 
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<KRi,R3<fMAX 
a<R2,R4<b 

(UNIFORMLY  DISTRIBUTED) 
(Rl,R2)  REJECTED 
(R3,R4)  ACCEPTED 


*•  X 


Figure  2.1  The  Rejection  Method 
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Figure  2.3  The  Probability  Density  Function  of  the 
Number  of  Trials  to  Get  All  N  Successful  Results  (p :   parameter,  N=64) 
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3.   FUNDAMENTAL  STRUCTURE  OF  THE  PROGRAM 

Both  this  chapter  and  the  next  chapter  describe  a  parallel  version 
of  the  Monte  Carlo  calculation  of  radiation  transport.   This  chapter  is 
concerned  with  the  fundamental  structure  and  the  problem  of  data  stor- 
age.  In  Section  3.1,  several  basic  schemes  are  discussed  and  compared. 
The  "Particle  Migration  Scheme"  has  been  adopted  for  the  problem  in  Chapter 
1. 

The  actual  data  structure  of  the  "Particle  Migration  Scheme"  is 
described  in  detail  in  the  next  section.   Several  techniques  such  as  con- 
version of  real  type  data  into  integers  and  chain  structure  are  also  dis- 
cussed. 

3.1   Basic  Program  Structures  and  Their  Comparison 

3.1.1  Geometrical  Structure  of  the  Medium  and  Hardware  Structures  of  the 
Computer 

It  is  assumed  in  the  subsequent  discussions  that  the  medium  is 
a  two  dimensional  rectangle.   Some  examples  are  shown  in  Figure  3.1.   The 
simplest  case  is  when  the  density  of  the  medium  p  is  a  constant  or  slow- 
varying  function  of  coordinates  so  that  the  table  for  p  can  be  contained 
within  a  PE.   Each  PE  can  then  cover  the  whole  domain  of  the  medium.   In 
such  a  case  it  does  not  matter  which  PE  processes  which  particle;  it  remains 
in  the  same  PE  until  destroyed.   The  entire  program  becomes  very  simple 
and  the  efficiency  depends  chiefly  on  the  number  of  logical  branches. 

If  the  table  for  p  is  too  large  to  store  in  one  PE  (or  too  uneco- 
nomical), it  must  be  distributed  among  PEs .   In  this  case,  the  correspondence 
of  the  hardware  structure  (PEs  and  their  memory)  to  the  geometrical  struc- 
ture of  the  medium  (p  table)  becomes  more  explicit.   Let  the  coordinate 
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system  attached  to  the  medium  be  called  the  "reference  coordinates".   As 
an  example,  if  the  X  coordinate  of  the  reference  corresponds  to  the  loca- 
tion of  PE  memory  and  the  Y  coordinate  to  PE  number  then  both  structures 
match  each  other.  (Straight  storage  case)   As  will  be  discussed  later 
there  are  some  other  ways  of  storing  p's,  in  which  the  correspondence 
becomes  more  obscured.   As  for  the  particles,  there  are  two  different  basic 
points  of  view  which  consequently  lead  to  two  different  program  schemes. 

The  first  scheme  is  to  fix  the  reference  coordinates  to  PEs  so 
that  particles  are  transferred  to  the  neighboring  PEs  as  they  move.   In 
other  words,  each  PE  is  assigned  within  a  certain  region  of  the  entire 
medium.   This  view  is  similar  to  the  "Eulerian  View"  in  hydrodynamics  and 
will  be  called  a  "Particle  Migration  Scheme"  in  the  following  sections. 
The  second  scheme,  on  the  other  hand,  is  that  each  PE  contains  a  certain 
number  of  particles  which  will  be  handled  in  it  until  they  are  destroyed. 
Therefore,  the  new  data  on  the  medium  are  fetched  to  a  PE  in  which  a  par- 
ticle is  stored  that  requires  them.   In  other  words,  if  one's  eyes  are 
fixed  on  a  PE  (or  the  hardware  structure)  a  particle  stays  in  it  and  the 
reference  coordinates  move  in  to  the  particle.   One  confusing  thing  about 
this  notion  is  that  the  motion  of  the  reference  coordinates  varies  with  par- 
ticles.  This  is  to  be  called  the  "Coordinate  Migration  Scheme".   The 
Lagrangean  formulation  in  hydrodynamics  is  somewhat  similar  to  this  scheme. 
Comparisons  of  the  two  schemes  will  be  made  in  the  next  sections. 
3.1.2   Particle  Migration  Scheme 

As  has  been  discussed  briefly  in  the  beginning  of  this  chapter, 
particles  actually  migrate  to  the  neighboring  PE  as  they  cross  cell 
boundaries.   This  is  most  efficiently  performed  by  using  ROUTE  instruction. 
The  merit  of  this  scheme  is  that  the  destinations  of  routing  are  limited  to 
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very  few  PEs.   This  makes  a  program  work  very  efficiently.   (Some 
statistical  results  for  a  simple  test  program  will  be  discussed  in  Chapter 
5).   There  are  several  ways  of  assigning  PEs  to  the  parts  of  the  medium. 
The  simplest  one  is  shown  in  Figure  3.3-a.   In  this  example,  each  PE  is 
assigned  with  a  long  strip  of  the  domain,  i.e.,  the  Y  direction  corresponds 
to  processor  number,  whereas  the  X  direction  corresponds  to  the  location 
in  the  memory  of  each  PE.   If  the  geometrical  structure  of  the  material  is 
cylindrical  and  the  source  is  located  at  its  center,  this  configuration 
can  be  adopted.   The  advantage  is  that  the  routing  distances  are  +1 ,  -1, 
or  0  so  that  the  program  works  very  efficiently.   In  the  example  of  Figure 
3.3-a,  the  displacement  of  particles  in  the  X  direction  does  not  involve 
routing.   There  is  a  disadvantage  in  the  straight  storage  if  many  par- 
ticles escape  outward  from  the  edge  of  the  medium.   Then  only  PE63  and 
PEO  will  have  to  take  care  of  those  which  escape  in  positive  and  negative 
y  directions  respectively  after  turning  off  all  other  PEs.   This  disparity 
on  the  boundary  can  be  solved  by  skewing  the  medium  as  shown  in  Figure  3.3-b. 
It  is  obvious  that  the  boundary  is  almost  equally  distributed  but  there  are 
now  5  different  routing  distances  possible.   There  is  also  a  problem  that 
the  initial  allocation  of  data  on  density  becomes  more  complicated.   (Fur- 
ther comparison  of  both  straight  and  skewed  storage  will  be  discussed  in 
Chapter  5). 

If  there  is  no  cylindrical  symmetry,  the  best  way  to  distribute 
the  boundary  equally  is  to  form  8x8  squares  (Checkerboard  storage). 
(Fig.  3.3-c,d).   In  these  cases,  there  are  9  different  routing  distances 
each  of  which  corresponds  to  a  neighbor  of  a  cell  or  a  cell  itself.   If  each 
particle  carries  a  lot  of  information  with  it  which  has  to  be  routed  to  its 
destination,  the  increase  in  routing  deteriorates.   The  boundary  problem 
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arises  in  this  scheme  too.   In  Figure  3. 3-d  the  boundary  is  more  equally 
distributed  than  in  Figure  3.3-c. 

The  conclusion  is  that  the  way  PEs  should  be  assigned  cells 
is  strongly  dependent  on  what  type  of  problem  is  to  be  solved  and  cannot  be 
determined  a  priori. 

Reducing  the  variables  characterizing  each  particle  is  as  important 
as  reducing  the  number  of  different  route  distances.   This  will  be  discussed 
and  illustrated  in  detail  in  Section  3.2. 
3.1.3   Coordinate  Migration  Method 

In  contrast  to  the  particle  migration  method,  particles  stay  in 
their  PEs  all  through  their  lives.   This  method  involves  a  table  lookup 
problem.   As  time  goes  on,  particles  diffuse  over  the  medium.   They  will 
require  the  data  on  the  density  stored  in  various  PEs.   Generally,  this  method 
is  not  applicable  in  practical  cases.   But  when  the  table  size  describing 
the  density  is  small  enough  to  store  entirely  in  a  few  PEs  (2^4) ,  then  the 
routing  distance  is  limited  and  this  method  is  usable.   The  routing  proce- 
dure for  the  "reduced  density"  is  simpler  than  the  particle  migration  method 
in  general,  because  the  number  of  data  to  be  routed  is  usually  smaller.   As 
this  method  was  not  adopted  in  the  program  shown  in  Chapter  4  no  further 
detail  will  be  discussed. 
3. 2   Particle  Migration  Scheme 
3.2.1  Particle  Tables 

Each  "particle"  in  the  table  consists  of  such  data  as  cell  number, 
(N) ,  relative  coordinates  within  a  cell,  (X,Y),  direction  cosine,  (u) , 
azimuthal  angle,  (sin  <j> ,  cos  <$>) ,  energy,  (E) ,  fraction  of  time  interval 
At,  (K) ,  survival  distance,  (t),  weighting  factor,  (W) ,  and  pointers.   All 
except  the  last  one  are  physical  quantities  associated  with  a  particle,  which 
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have  been  explained  in  Chapter  1.  Pointers,  on  the  other  hand,  are  for 
higher  efficiency  of  computation.  Certain  techniques  yield  less  memory 
requirements  and  faster  computation. 

3.2.2  Packing  Data  and  Integer-Real  Conversion 

The  memory  size  problem  is  very  serious  because  each  PE  has  only 
2048  words.   Therefore,  those  previously  mentioned  data  must  be  packed  as 
densely  as  possible.   Fortunately,  GLYPNIR  enables  any  part  of  a  word  to 
be  loaded  to  a  register  as  an  integer.   If  16  bits  are  enough  for  an  integer, 
4  such  data  can  be  packed  into  a  full  word.   Real  numbers,  however,  can  be 
only  full-sized  (48  bit  mantissa)  or  half-sized  (24  bit  mantissa).   It  is 
possible  to  store  a  real  number  in  the  form  of  an  integer  if  it  has  some 
upper  limit.   Such  cases  are:   direction  cosine,  azimuthal  angle,  (|u|  , 
|  sin  <j>  |  ,  |  cos  <j)|<l)  and  relative  coordinates  (0<x,  y<l).   To  accomplish 

this,  the  following  rules  were  used:   to  store  a  real  number,  multiply  it 

p 
by  2   (p  =  15 ,  for  example)  and  store  the  obtained  integer  into  some  location; 

to  read  it  out  and  compute  with  it,  load  it  to  a  register  as  an  integer, 

shift  it  to  the  left  end  of  the  mantissa  part,  and  change  the  exponent  part. 

With  the  combination  of  these  techniques  all  the  data  on  a  particle  are 

packed  into  4  full  words  or  less.   (Fig.  3.4). 

3.2.3  Chain  Structure 

The  efficiency  of  computation  is  greatly  improved  by  linking 
particles  of  the  same  characteristics  (e.g.  those  to  be  scattered)  to  form 
a  chain.   The  pointers  which  have  been  mentioned  in  the  beginning  of  this 
chapter  are  for  this  purpose.   Figure  3.5  shows  how  a  pointer  works.   If 
particles  of  particular  interest  are  not  linked,  all  the  particles  in  the 
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table  must  be  scanned.   (Fig.  3.5-a). 

On  the  other  hand  a  pointer-linked  table  can  be  scanned  efficiently. 

Suppose  the  table  size  is  M.  I.    is  the  length  of  the  chain  in  the  ith  PE 

and  aM  is  the  expectation  value  of  I.    (i  =  0,  1,  .  .  .63).   Then  the  number 

of  steps  to  scan  the  table  is  ^  =  max  I. 

0<i<63 

The  distinction  between  the  linked  and  the  unlinked  is  more  obvious 

if  the  entire  table  (M)  consists  of  several,  say  N,  subsets  and  any  particle 

belongs  to  one  of  them  and  only  one  of  them.   The  number  of  steps  is  N  x  M 

to  scan  the  unlinked  table,  (because  the  table  must  be  scanned  N  times), 

while  the  number  of  steps  to  scan  the  linked  table  is: 
N 
L   =   I   Max     {I.     (I))  <_   N  x  M 
1=1  0<i£63  1 

See  appendix  1  for  mathematical  derivation  and  an  example.   An 
illustrated  example  is  also  given  in  Figure  3.6. 

Pointers  can  be  either  included  in  particle  tables  or  separated 
from  them.   The  first  method  is  good  if  there  are  only  a  few  pointers. 
(Such  is  the  case  of  the  program  listed  in  this  paper).   But  the  second 
method  is  better  if  there  are  many  (^8)  pointers  associated  with  such  par- 
ticles; they  need  not  be  routed  when  a  particle  crosses  cell  boundary. 

Suppose  that  every  processor  contains  100  particles  each  of  which 
consists  of  4  words.   The  pointer  must  take  values  0,  4,  8  .  .  .396  (<512). 
Therefore,  8  pointers  can  be  packed  into  a  full  word  (64  bits).   (Note  that 
8  bits  are  enough  for  each  of  them). 
3.2.4  Material  Table 

The  material  table  stores  data  such  as  energy  (temperature),  density, 
specific  heat  of  64  x  32  cells,  besides  tables  of  absorption  and  scattering 
coefficients.   Each  cell  is  assigned  with  two  full  words  and  data  are  packed 
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in  the  same  way  as  before,  some  as  integers,  some  as  real  numbers. 

If  energy  of  newly  created  particles  at  each  time  step  is  to  be 
sampled  from  a  Planckian  population,  its  interpolation  table  must  be  stored 
too. 

3.2.5  I/O 

Because  of  the  limitation  of  memory  size,  only  a  portion  of  large 
particle  tables  can  be  stored  in  the  PE  memories.   Therefore,  they  must  be 
stored  in  disc  memory  after  being  processed  sufficiently  and  a  new  particle 
table  must  be  brought  into  the  PE  memories.   It  is  not  efficient,  however, 
to  stop  processing  while  I/O  is  being  done.   A  better  way  is  to  divide  each 
memory  into  three  large  blocks  titled,  NEW,  OLD,  PROCESS,  which  corresponds 
to  the  area  into  which  a  new  particle  table  is  being  read,  the  area  from 
which  a  particle  table  is  stored  into  disc  and  the  area  of  currently  pro- 
cessed particle  tables,  respectively.   NEW  becomes  PROCESS,  PROCESS  becomes 
OLD,  OLD  becomes  NEW  in  the  next  phase  and  so  on.   (Fig.  3.7). 

3.2.6  Total  Capacity  of  Particles 

The  capacity  of  particles  in  the  simple  case  when  the  medium  con- 
sists of  64  x  32  cells  can  be  calculated  by  taking  all  the  above  things  into 
consideration: 

Particles   Medium 
.a. 


where 

N   :   the  number  of  particle  table  entries 

E   :   table  for  random  sampling  (Planckian)  +  table  for  absorp- 
tion and  scattering  coefficients  -  50  (non-monochromatic) 

0  (monochromatic) 
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F  :   instructions  and  GLYPNIR  system  -    200 

N  =  1734/12  k   144 
64N  =   9200 
Therefore,  about  9200  entries  are  available  for  particles 
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Figure  3.1   Some  Examples  of  Geometrical  Structure 
of  the  Medium  and  the  Corresponding  Cell  Number  Assignment 
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Figure   3.2      Comparison  of   Two  Different    Schemes 
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Figure  3.3   Some  Examples  of  PE  Assignment 
in  the  Particle  Migration  Scheme 
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Figure  3.3-c   Some  Examples  of  PE  Assignment  in  the 
Particle  Migration  Scheme  (continued)  -  Straight  Checkerboard  Storage 
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Figure  3. 3-d   Some  Examples  of  PE  Assignment  in  the 
Particle  Migration  Scheme  (continued)  -  Skewed  Checkerboard  Storage 
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Figure  3.4  An  Example  of  Particle  Table 
(used  in  the  test  program  in  Appendix  3) 
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Figure  3.5   Comparison  of  a  Chain-linked  Table 
and  a  Usual  Table 
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4.   A  TEST  PROGRAM  FOR  ILLIAC  IV 

This  chapter  is  concerned  with  the  description  of  the  program 
which  was  written  in  GLYPNIR  (Ref.  [8])  and  was  run  on  simulator  SSK. 
(Ref.  [17]).   It  is  given  in  Appendix  3.   This  program  was  written  to  test 
the  feasibility  of  parallel  computation  for  this  problem  and  also  to  serve 
as  an  example  for  more  detailed  programs. 

The  simplifications  which  were  made  for  this  test  program  were 
discussed  in  Section  1.3.   A  flow  chart  is  in  Figure  4.1. 

The  whole  program  consists  of  about  20  small  subroutines,  names 
of  which  are  given  in  Table  4.1. 
4. 1   Description  of  Major  Subroutines 

DATALOAD — All  the  numerical  quantities  which  are  necessary  to  ini- 
tialize the  programs  are  read  in.   These  are: 

(1)  initial  random  numbers 

(2)  the  length  of  one  time  step  (At) 

(3)  mass  absorption  coefficient  (oabs) 

(4)  mass  scattering  coefficient  (oscatt) 

(5)  density  distribution 

(6)  specific  heat  distribution 

(7)  initial  temperature  distribution 

CHAIN — This  subroutine  is  to  link  the  table  entries  of  some  common 
characteristics  (such  as  scattering,  routing,  etc.)  to  form  a  chain. 
It  was  briefly  mentioned  in  Chapter  3.   It  functions  as  follows: 
Suppose  the  Nth  table  entry  is  to  be  linked  to  the  chain  for  scattering, 
First,  the  pointer  part  of  the  Nth  entry  is  replaced  by  the  contents 
of  a  register  which  always  keeps  the  location  of  the  edge  of  the 
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chain  for  scattering.   Then  N  (which  is  the  most  up-to-date  location 
of  the  edge)  is  stored  into  the  register. 

Due  to  the  necessity  for  handling  two  different  pointers  by  one 
subroutine,  indexing  facility  is  also  included.   Namely,  either  of 
two  pointer  parts  of  a  particle  table  entry  can  be  specified  by 
changing  an  input  parameter  Q.   Q  =  0  for  scattering  and  vacancy, 

whereas  Q  =  1  for  routing. 

15 
CVTRL — This  subroutine  divides  a  15  bit  integer  by  2   without 

using  division.   It  was  also  discussed  in  Chapter  3. 

RNDX — This  is  a  random  number  generator  which  generates  uniformly  dis- 
tributed random  numbers  by  the  linear  congruential  method.   See  Chap- 
ter 2  for  detail.   Initial  random  numbers  are  read  into  PEs  at  the 
beginning  of  the  program. 

ENERGYBALANCE — This  subroutine  calculates  the  energy  to  be  emitted  by 
radiation  in  each  cell.   The  total  radiative  energy  is  summed  up  to 
assign  the  number  of  particles  in  the  next  subroutine. 
PARTICLEASSIGN — The  number  of  new  particles  in  each  cell  of  the  medium 
is  determined.   It  is  proportional  to  the  radiative  energy  in  the  cell. 
To  avoid  the  overflow  problem  of  particle  table,  only  the  two-thirds 
of  the  available  vacancy  in  PE  is  used  to  create  new  particles.   The 
energy  per  "particle"  is  merely  (the  radiative  energy  per  cell) /(the 
number  of  particles  in  the  cell).   That  is,  the  energy  is  monochromatic. 

The  whole  program  has  been  made  so  that  the  radiation  process 
can  be  extended  in  the  future  to  non-monochromatic  case.   The  weighting 
factor  in  the  particle  table,  for  example,  is  meant  for  this  purpose. 
If  the  radiation  is  not  monochromatic,  tables  for  absorption  and  scat- 
tering coefficients  as  well  as  data  for  Planckian  distribution  must  be 
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stored  in  PE  memory. 

BIRTH — New  particles  are  created  in  this  subroutine  according  to 
Section  1.3. 

TRANSLATE — This  subroutine  has  two  functions.   One  is  to  translate 
particles  to  their  new  locations.   The  other  is  to  detect  those  par- 
ticles which  cross  the  cell  boundaries  and  those  which  escape  from 
the  medium. 

ABSORB — This  subroutine  is  called  when  a  particle  is  absorbed  by  the 
medium.   The  particle  is  destroyed  and  its  table  entry  is  linked  to 
the  VACANCY  chain. 

ESCAPE — This  subroutine  is  called  when  a  particle  escapes  from  the 
medium.   The  particle  is  also  destroyed,  its  table  entry  is  linked 
to  the  VACANCY  chain,  and  ESCAPE  counter  is  incremented  by  its  weight. 
BRANCH — This  subroutine  calls  TRANSLATE,  decides  interaction  types  of 
particles,  then  calls  ABSORB,  ESCAPE  to  process  particles.   Those  which 
are  to  be  scattered  are  not  processed  here  but  linked  to  form  a  chain. 
This  is  because  the  scattering  process  is  very  long  and  it  is  desirable 
to  process  them  in  parallel  at  a  later  time.   Those  particles  which 
cross  the  cell  boundaries  toward  the  y-direction  are  also  linked  for 
routing.   This  subroutine  is  called  both  in  the  main  program  and 
SCATTER. 

ANGLE — New  sets  of  angles  after  Thomson  scattering  are  calculated. 
The  algorithm  to  sample  the  scattering  angles  was  described  in  Section 
2.2.   New  angles  are  then  obtained  by  using  spherical  trigonometric 
formulae  in  Section  1.1. 

SCATTER — Particles  in  the  chain  for  scattering  are  processed  after 
the  table  scanning  is  finished  in  the  main  program.   This  subroutine 
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calls  ANGLE,  BRANCH,  STORE,  and  functions  almost  in  the  same  way  as 

the  main  program  does  to  scan  the  table. 

ROUTE — Particles  in  the  chains  for  routing  are  actually  transferred 

to  their  destinations  in  this  subroutine.   Index  K  =  1  specifies  the 

routing  to  the  right  and  K  =  -1  specifies  the  routing  to  the  left. 

The  vacancy  which  has  just  been  made  is  linked  to  the  VACANCY 

chain, 

TALLY — This  subroutine  is  used  to  sum  up  the  intensity  of  radiation 

in  each  cell.   The  summed  intensity  is  necessary  in  ENERGYBALANCE  to 

calculate  the  new  temperature. 
4.2  Main  Program 

The  main  program  contains  all  the  global  variable  and  subroutines 
and  calls  the  subroutines  in  the  sequence  given  in  Figure  4.1. 

The  main  program  starts  with  the  reservation  of  blocks  for  PCPOINT- 
ERS.   The  constants  and  counters  are  initialized.   All  the  particle  table 
entries  are  linked  as  vacancy.   A  large  loop  for  iterations  is  entered, 
one  step  of  which  corresponds  to  the  time  interval  At.   The  radiative 
energy  computed  in  ENERGYBALANCE  decides  the  particle  distribution  in  PAR- 
TICLEASSIGN.   BIRTH  follows  it  to  create  new  particles  in  the  vacancy.   Then 
the  table  scanning  starts,  (this  is  a  sequential  scanning  from  the  top  to 
the  bottom),  during  which  the  coordinates  of  particles  are  updated,  the 
types  of  interactions  are  determined,  and  so  on.   After  all  the  entries  of  the 
particle  table  are  scanned,  the  main  program  proceeds  to  the  scattering 
process  SCATTER,  which  is  followed  by  ROUTE  and  TALLY.   This  completes  one 
time  cycle  and  the  intermediate  results  may  be  printed  out  by  TABLEPRNT. 
The  entire  program  consists  of  more  than  800  cards  in  source  code  (including 
comment  cards)  and  is  compiled  into  more  than  5,500  instruction  syllables 
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of  assembly  codes.   About  220  words  of  PE  memory  in  every  PE  are  occupied 
by  these  instructions. 


Table  4.1 


Name 

Function 

Type 

Called  By 

ABSORB 

Absorption 

BRANCH 

ANGLE 

New  Scattering  Angles 

SCATTER 

BIRTH 

Creat  New  Particles 

MAIN 

BRANCH 

Interaction  Type 

MAIN 

CHAIN 

Table  Link 

PARTICLEASSIGN, 
BIRTH,  BRANCH, 
SCATTER,  TRANS- 
LATE, MAIN 

COS 

Cosine 

PREAL 

BRANCH 

CVTRL 

Integer>Real 

PREAL 

BRANCH 

DATALOAD 

Initial  Data 

MAIN 

DSTBND 

Distance  to  Boundary 

BRANCH 

ESCAPE 

Count  the  Number 

of  Escapes 

BRANCH 

ENERGYBALANCE 

New  Energy 

MAIN 

EXP 

Exponential 

PREAL 

LN 

Natural  Log 

PREAL 

BIRTH,  SCATTER 

MOD 

Mod  (64) 

PINT 

BIRTH,  TRANS 
LATE 

PARTICLEASSIGN 

RNDX 
ROUTE 
SCATTER 
SIN 


Number  of  Particles 
to  be  Created 

Random  Number         PREAL 

Routing  to  Neighbors 

Scattering 

Sine  PREAL 


MAIN 

BIRTH,  ANGLE 

MAIN 

MAIN 

BIRTH,  ANGLE 
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Table  4.1   (con't) 

Name 
SQRT 
STORE 


TABLEPRNT 


TALLY 


Function 
Square  Root 


Store  to  Particle 
Table 

Print  out  the  Contents 
of  Tables 

Intensity  of  Radiation 
in  each  Cell 


Type      Called  By 
PREAL  BIRTH,  ANGLE 
MAIN,  SCATTER 

MAIN 


MAIN 


SUBROUTINE    NAME 


DATA   LOAD 


48 


NEXT  TIME  STEP 


NEXT   TABLE   ENTRY 


ENERGY   BALANCE 


PARTICLE   ASSIGN 


BIRTH 


BRANCH 


TRANSLATE 


ESCAPE 


(g 


TERING 
INK 


5 


f    ROUTE    'N 
^      LINK     J 


STORE 


SCATTER 


[    ANGLE    ] 

' 

' 

[    BRANCH  | 

' 

' 

ROUTE 


TALLY 


Figure   4.1      A  Flow   Chart   of    the   Test    Program 
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5.   SOME  COMPUTATIONAL  RESULTS 

Some  of  the  computational  results  are  discussed  in  this  chapter. 
Since  the  simulator  works  very  slowly,  it  is  not  practical  at  the  present 
time  to  run  the  whole  program  for  many  time  steps.   Therefore,  emphasis 
was  put  on  the  statistical  investigation  of  PE  efficiency  aspects.   Some 
factors  investigated  here  are: 

(1)  cell  size  vs  velocity 

(2)  effect  of  multiple  scattering 

(3)  effect  caused  by  using  modified  probability  density  function 
for  the  survival  distance  (t) 

(4)  effect  of  skewed  storage 

All  the  test  programs  have  been  run  under  the  following  assump- 
tions: 

(1)  the  material  is  homogeneous 

(2)  no  absorption  is  involved 

(3)  scattering  cross  section  is  a  constant  (^  ) 

(4)  scattering  is  isotropic 

(5)  mesh  size  is  64  x  8  except  one  case  (64  x  64) 

5.1  Determination  of  Cell  Size  (AX)  and  Time  Interval  (At)  and  Their 
Relation  to  the  Rate  of  Boundary-Crossing 

It  is  important  to  know  how  many  particles  involve  routing.  This 
rate  is  predicted  to  be  dependent  on  vAt/AX.  Figure  5.1  shows  some  compu- 
tational results.  Sixteen  particles  are  created  in  each  PE  at  the  beginning 
of  the  time  interval  At.  Scattering  cross  section  is  taken  as  a  parameter. 
The  rate  grows  quite  linearly  with  vAt/AX.  This  means  that  vAt/AX  does  not 
affect  the  efficiency  as  far  as  routing  is  concerned;  if  At  is  doubled  then 
the  number  of  time  steps  is  halved.   The  rate  decreases  somewhat  with  the 
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increase  of  a   because  some  of  the  particles  which  have  crossed  boundaries 
are  scattered  back  to  their  original  cells. 

The  cell  size  AX  must  be  determined  uniquely  by  how  the  density 
(p)  of  the  medium  varies  with  position.   If  p  changes  very  slowly,  each 
PE  can  cover  a  wider  range  (i.e.,  large  AX)  and  vice  versa. 

At  has  an  upper  limit;   no  particle  can  travel  through  two  or 
more  cells  in  one  time  step,  i.e.,  vAt<AX.   Within  this  limit,  At  can  be 
chosen  arbitrarily.   Small  At  will  bring  more  accurate  results  on  the 
temperature  and  the  intensity,  but  the  number  of  time  steps  increases. 
Large  At,  on  the  other  hand,  will  bring  more  crude  results  in  shorter  time. 
In  this  case,  more  routing  is  involved  in  each  time  step.   Refer  to  Section 
5.5.4  for  a  related  topic. 
5. 2   Effect  of  Multiple  Scattering 

If  a  particle  experiences  more  than  one  scattering  in  At,  the 

efficiency  of  parallelism  obviously  decreases.   Therefore,  it  is  important 

to  see  the  relation  of  multiple  scattering  to  scattering  cross  section  a 

and  &=VAt .   The  process  is  simply  a  Poisson  process  if  a  is  constant. 

Therefore,  the  probability  that  a  particle  is  scattered  n  times  while  it 

travels  the  distance  £  is:   Pn(£)  =   (q&)   e    .   (Derivation  is  in  Appen- 

n! 

dix  2).   Figure  5.2  shows  the  theoretical  curve  (solid  lines)  and  the  result 
of  computation.   It  is  obvious  that  the  number  of  multiple  scattering  in- 
creases with  I.      At  i=l   about  20%  of  the  particles  are  scattered  twice  or  more, 
Multiple  scattering  is  treated  just  as  usual  scattering  in  the  program 
described  in  Chapter  4:   if  a  particle  is  re-scattered  after  a  scattering, 
then  it  is  linked  again  to  the  top  of  "scatter  chain".   Therefore,  it  is 
processed  together  with  some  others  which  are  located  in  different  PEs  and 
are  to  be  scattered  for  the  first  time. 
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It  is  not  a  good  idea  to  enable  PEs  for  only  the  second  scattering 
after  each  scattering,  then  to  switch  to  the  third  and  so  on.  .  .  .   Statis- 
tical fluctuation  would  work  in  the  most  unfavorable  way. 

5. 3  An  Effect  Caused  by  Modifying  a  Probability  Distribution  Function 
for  Survival  Distances  of  Particles 

An  effort  was  also  made  to  modify  the  probability  density  function 
for  survival  distance  to  reduce  the  rate  of  multiple  scattering.   One  method 
is  simply  to  cut  off  the  density  function  below  some  value  l=a.      This  results 
in  shifting  all  Pn(x)s  (n>l)  to  the  right  by  a  with  the  exception  that 
Po(x)=l   (x>a) .  If  a  is  chosen  to  be  vAt,  then  there  will  be  no  multiscat- 
tering  at  all.   But  this  is  too  unnatural.   The  second  method  investigated 
gives  a  linear  growth  up  to  x=a  and  exponential  decay.   (Fig.  5.3). 
Starting  at  x=0,  each  curve  seems  to  be  located  between  the  one  sampled 

—  *r  —  I  T  — 3.  ) 

from  P(x)=e   and  the  one  from  P(x)=e      ,  but  it  soon  exceeds  the  former. 
There  is  an  unusual  increase  due  to  the  sharp  peak  in  the  function  P(x) 
at  x=a.   For  smaller  x,  however,  the  rate  of  scattering  is  reduced.   Some 
order  of  the  Erlang  distribution  will  give  a  smoother  result. 

5.4  Skewed  Cell  Boundaries  Vs  Straight  Cell  Boundaries 

It  was  stated  in  Chapter  3  that  the  skewed  cell  boundaries  smooth 
out  the  distribution  of  the  number  of  particles  which  escape  from  the  medium. 
The  ideal  case  is  when  the  mesh  size  of  the  medium  is  64  x  64;  every  PE 
has  the  same  number  of  edges.  (Fig.  3.3-b).   This  scheme,  however,  increases 
routing.   Therefore,  it  is  necessary  to  check  which  method  is  better.   Sev- 
eral tests  have  been  made  to  compare  these  two  schemes. 

The  mesh  sizes  for  testing  were  64  x  8  and  64  x  64;  in  each  case 
the  same  data  were  interpreated  in  two  ways;  skewed  case  and  straight 
case.   Results  are  shown  in  Table  5.1  and  Figures  5.4,  5.5.   The  conditions 
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for  numerical  experiments  are  the  same  as  before:   isotropic  scattering, 
constant  a  ,  and  each  particle  travels  for  the  time  At.   Ax=l.   a  and 
velocity  are  specified  in  each  case. 
5.4.1   64  x  8  Cells 

Data  were  taken  for  a  =  «4,  velocity=l,  and  a=.2,  velocity=.5. 
Ten  particles  were  created  in  each  cell  at  the  beginning  of  a  time  step  and 
they  were  all  destroyed  at  the  end  of  this  time  step.   This  process  was 
repeated  5  times  and  the  contents  of  the  counters  were  accumulated.   There- 
fore, effectively  40  particles  were  examined  in  each  cell  (not  in  each  PE) . 
Unnecessary  subroutines  were  removed.   In  Figures  3.3-a  and  3.3-c,  there  is  a 
peak  which  is  due  to  the  PEs  corresponding  to  the  edges  of  the  medium.   These 
peaks  were  fairly  smoothed  out  in  the  skewed  cases  shown  in  Figures  3.3-b  and 
3. 3-d.   As  for  routing,  the  distributions  turned  out  to  be  Gaussian-like  in 
all  examples.   Obviously,  more  routing  is  involved  in  the  skewed  storage.   Also 
shown  are  the  distributions  of  the  double  routing  in  skewed  storage.   That  is, 
the  number  of  particles  which  have  to  be  routed  twice  to  get  to  their 
destinations.   (Fig.  3.3-a). 

Table  5.1  is  the  summary  of  these  numerical  experiments.   The 
numbers  "max-min"  in  the  first  column  of  each  experiment  mean   the  maximum  and 

minimum  numbers  of  particles  among  PEs,  respectively,  which  escaped  from  the 
medium  or  crossed  the  cell  boundaries.   "PE  average  %"  means  the  average 

percentage  of  enabled  PEs.   This  was  obtained  from: 

63 
I  (counter  (PE)) 

PE=0 

PE  average  %  = 

max  x  64 

For  example,  (49-4,  24%)  in  the  first  row  means  that  there  were  49  particles 

escaped  from  one  PE  whereas  there  were  only  4  particles  from  another  PE. 

All  others  are  between  these  two  values.   Therefore,  49  steps  are  taken  to 
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process  these  particles  in  parallel,  and  24%  of  PEs  are  enabled  on  the 

average. 

5.4.2   64  x  64  Cells 

Data  were  taken  for  a=.4,  velocity=l,  and  a=.2,  velocity=.5. 
Only  the  first  example  is  shown  in  Figure  5.5.   Only  one  particle  was 
created  in  each  cell  at  the  beginning  of  a  time  step  and  they  were  destroyed 
at  the  end  of  this  time  step.   This  process  was  repeated  5  times  so  that 
5  particles  were  examined  in  each  cell  effectively.   Obviously  the  number 
of  escapes  is  smoothed  out  across  PEs.   The  lower  half  of  Table  5.1  can  be 
read  in  the  same  way  as  described  in  Section  5.4.1.   The  conclusion  is 
that  the  skewing  does  not  always  bring  more  favorable  results  than  the 
ordinary  storage. 

It  is  true  that  the  particles  that  escape  from  the  medium  are  very 
well  distributed  across  PEs,  but  the  increase  of  the  number  of  routing  is 
so  big  that  it  cancels  out  this  merit.   Therefore,  the  skewing  is  better 
than  the  ordinary  storage  only  in  limited  cases  such  as  when  ESCAPE  takes  a 
much  longer  time  than  ROUTE. 
5 .5   Concluding  Remarks 

The  program  which  is  described  in  this  document  is  a  very  simple 
one  and  expandable  to  more  complicated  cases.   It  is  inevitable,  however, 
that  the  efficiency  goes  down  as  the  program  becomes  more  complicated. 

There  are  several  problems  which  are  left  to  be  solved. 
5.5.1  Problem  of  Table-Overflow  and  Normalization 

As  was  discussed  in  Chapter  3,  the  particle  migration  scheme  has 
been  adopted  to  obtain  higher  efficiency  in  parallel  computation.   As  a  con- 
sequence, however,  particles  must  be  physically  transferred  to  other  PEs. 
If  many  particles  flow  into  one  PE,  it  will  not  be  able  to  accept  all  of 
them.   To  encounter  this  problem,  certain  number  of  table  entries  must  be 
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saved  in  PEs  for  those  visitors. 

Cells  must  be  well  distributed  among  PEs  so  that  the  outgoing 
particles  cancel  the  incoming  particles  on  the  average;   if  many  par- 
ticles flowed  into  one  PE,  it  would  not  be  able  to  accept  all  of  the  par- 
ticles.  This  is  an  extreme  case,  but  a  certain  number  of  table  entries 
still  must  be  saved  in  PEs  as  buffers  for  those  visitors.   There  is  a 
technique  called  "Russian  Roulette"  in  which  incoming  particles  are  rejected 
with  probability  p  and  accepted  with  the  probability  1-p.   In  the  latter 
case,  the  weighting  factors  are  multiplied  by  l/(l-p).   This  is  one  of  the 
typical  techniques  to  reduce  variances.   Special  care  must  be  taken,  however, 
when  it  is  used. 

5.5.2  Problem  of  Word  Size  (alternative  format) 

When  the  program  was  originally  implemented,  an  emphasis  was  put 
on  the  capacity  of  the  particle  table.   Consequently,  only  16  bits  were  assigned 
to  those  words  which  range  in  [-1,1].   But  if  more  accuracy  is  needed  on 
those  quantities,  three  of  them  can  be  packed  into  one  full  word.   In  this 
case,  each  word  has  21  bits,  which  is  comparable  to  the  fraction  part  of  a 
half-word  (24  bits).   An  example  of  word  format  is  given  in  Figure  5.6. 

5.5.3  About  the  Flexibility  of  Logical  Network  Which  Interconnects  PEs 

In  the  problems  which  involve  probabilistic  nature,  it  is  highly 
desirable  that  the  interconnection  of  PEs  be  as  flexible  as  possible.   A 
good  example  is  the  table  look  up  problem  which  was  discussed  in  Chapter  2. 
If  the  interconnection  logic  is  programmable  by  users  it  will  be  much  easier 
to  handle  more  complicated  problems.   This  may  be  one  way  parallel  computers 
should  be  organized. 

5.5.4  Variable  Contraction  Ratio  of  Cell  Sizes 

It  is  possible  to  change  the  cell  sizes  in  X  direction,  Y  direction, 
or  both.   Take  Y  direction,  for  example.   If  the  density  gradient  is  not 
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uniform  in  Y  direction,  it  is  better  to  adjust  Ay  so  that  it  covers  a 
wider  range  when  the  density  varies  slowly  and  a  narrower  range  when  the 
density  varies  rapidly  in  that  direction.   The  reduction  ratio,  a.,  in 
each  row  of  cells  is  stored  as  a  PE  variable  and  the  displacement  of  par- 
ticles in  Y  direction  is  modified  as 

; 2 

Y  ■<-  Y  +  a.  V  At  vl-i4   *  sin  d> 
1 

when  the  particles  are  in  the  ith  row  of  cells. 
Survival  distance  and  the  displacement  in  X  direction  are  not  changed  in 
this  example. 


Table   5.1 
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Figure  5.1  Percentage  of  Particles  Which 
Have  Crossed  Cell  Boundaries  Vs  Velocity 
s:   parameter  (isotropic  scattering) 
Ax=l,  At=l   1024  particles 
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Figure   5.2     Multiple    Scattering   During 
At=l  vs   as£=asvAt    (isotropic   scattering) 
(Solid   curves   are   Pn(£)=    (afiJQn     e~°  sl     n=o,l,2,3) 

n! 
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PARTICLES 


p(r) 
1/.2E 


A.  LINEAR 


W=.2 


Figure  5.3  An  Example  of  the  Modified  P(T)  with  W=.2 
(above)  and  the  Computational  Results  (below) 
(Solid  Curves  are  Pn  (£)=  (ol)n  e~al   n=0,l,2,3) 
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Figure  5.3  (continued)   Comparison  of  Several  P(t)s 
with  Different  Ws  (W=0, .2 , .4 , .6) 
(Only  the  Results  for  N=l  are  Displayed) 
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APPENDIX  1 

As  is  discussed  in  Chapter  3,  it  is  interesting  to  investigate 
the  probability  density  functions  of  the  maximum  length  of  chain-linked 
tables  among  PEs.   Let  the  problem  be  stated  as  follows:   There  are  N  PEs 
and  each  PE  has  a  table  of  M  entries  each  of  which  is  selected  with  .the 
probability  a(0<a<l)  so  that  the  average  length  of  the  chains  is  CM.   What 
is  the  distribution  of  the  maximum  length  of  the  chains?   Assume  that  M 
is  a  large  number  and  a  is  neither  too  small  nor  too  close  to  1. 

Theorem.   If  X  ,  X  have  the  distribution  functions  F  (X) , 
F9 (X)  respectively,  the  distribution  function  of  the  stochastic  variable 
X  =  max  (X  ,  X2)  is  F(X)  =  F  (X)F2(X). 

Proof.   Let  P, (X) ,  P2(X)  and  P(X)  denote  the  probability  density 
functions  of  X  ,  X9 ,  and  X,  respectively.   Then, 

P(X)dX  =  P   (X)dx  /X  P  (X')dX'  +  P2(X)dX  J*X  P  (X')dX' 

=  P   (X)F2(X)dX  +  P2(X)F1(X)dx 

.  .F(X)    =  F1(X)F2(X).  Q.E.D. 

It  is  easy  to  derive  the  formula  inductively  when  there  are  N 
independent  variables.   That  is,  X  =  Max(X  ,  X« ,  .  .  .X  )  has  the  distri- 
bution function  F(X)  =  Fn(X)F0(X)  .  .  .  F  (X)  where  Fn  (X)  is  the  distri- 

1    z  n  1 

bution  function  of  the  ith  variable. 

Going  back  to  the  original  problem,  if  the  given  assumptions  are 
valid   the  distribution  functions  of  the  length  of  the  chain  in  each  PE  can 
be  approximated  by  the  normal  distribution: 
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IX  2 

*(X   )   =     /2TtMa    (1-a)      exp  /   X     exp     {-   ^Tl-a)    }      dt      (i  =  lj    2'    '    '    'N) 

~<X>  ^        ' 

Therefore  the  distribution  function  of  the  maximum  length  $(X)  is: 

N 
$(X)  =  {$(X)}   or  the  probability  density  P(X)  is: 

1  2 

P(X)  -  N  {$(X)}N_1   /27rMa(l-a)    exp  {-  ^CcL-o^ 

Figure  Al.l  shows  P(X)  for  N(0,1)  when  N  =  64.   P(X)  has  the  maximum  value 
at  X  =  2.1.   This  was  obtained  by  plotting  values  calculated  from  the 
mathematical  table.   (Ref.  [16]) 
(Example)  N  =  64  (ILLIAC  IV) 

M  =  100 

a  =  .3 

0<Z<100 

Z  =  30 

normalization  of  the  variable 

Z-30 

X  =?  /0. 3x0.  7x100   =  =-?r- 

4.6 

X(Z  =  100)  =  15.2 

X(Z  =0)   =  16.5 
Z(peak)  =  30  +  2.1  x  4.6  =  39.7 
Therefore  the  expectation  value  of  the  maximum 
length  of  the  chain  is  40. 
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APPENDIX  2 


P  (£)  in  section  5.2  is  derived  in  the  following  way: 
Suppose  n-1  scatterings  have  happened  in  [0,x]  and  the  nth  scattering 
happens  in  [x,  x+dx] .   Then  the  probability  that  n  scatterings  happen  any- 
where in  [0,JI]  is 

■I 


P  (£)  =  \\     P  .  (x)  •  adx  PoOl-x) 
n         t  n-.  


%     1   -ax 


„  /.x     fX,   1    -ox, 

Po(£)  =  /n  —  e    dx  = 
J  0  a 


-a  A 


Laplace  transformation  yields 


qn(3)  =  /o  6 


P  (H)dfc 
n 


no  scattering 
nth  scattering 
n-1  scattering 


=   ^n_x  (s)  q0(s) 


n         ,    N  n-f  1 
=  a     qo(s) 


-s£-a£ 


qo(s)  =  Jo  e  di  = 


Sfa 


.*.    qn(s)    = 


(s  +a) 


-ti+1 


Inverse  Laplace   transformation  yields 

n        „,.        sA 

P  U)  -  f-.  Jc+1"  e  +    .,ii 

n  2iti  J  _    .    (a  +  s) 


=  a     Res.  (s  =  -a) 


(a£)' 


-a£ 


ds 


(C>0  real) 


Q.E.D. 
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THIS  Is  TO  CONVERT  16  BIT  ImUfcE*  INTO  *<*    « I  T  RFAL. 
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PINT     SlHROUTlNF    MUU(  PINT     A}» 

I  IMiS    Si|H«fiUTlNE    I*    uStO  Tn    CALCULATE    THE    "OEPTH"    r,F    TA~Lt    EnTRIES. 
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MADI4TION    IN    THE    NtxT    STFP.    THIS    SUBROUTINE    SHOuLD    RE 
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Hr    "PxPTICLEASSKjn". 
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<FGTn 

IhIS    SURROuTJufc.    IS    in    nLSTiiI^llTF    pawTICLE    TAuLt    Tn    CFLLS. 
ENERGY    nlSTHlh'iTIiK'    IS     JUT    C"*S  I UFKFD.  FNLR(,  V    IS    CnwSF«vF0. 
CFLL    NIIMHTpS    •)►     PARTICLE.S     M*F     ALST    ASSIGNED    HF  RE  .     VACANCY 
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THIS  SUBROUTINE  IS  TJ  TREAT  PARTTCI..F*  WHICH  EsCAPEs  HUT  Or  THE 
MtQluN.TASLES  FOR  THOSE  PAHT1CLFS  ARF  OESTORyEo  »NO  THr 
COUNTER  IS  iNCBrML.UCD  MY  -EIGHT. 
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tlXX*»XXX»fXXIXXXX«t(ltlXXSXt(XX%JX«*XtXXXl!X«XXXXXXX«XXXX*tXX«XXXXXXXXtXXXXXIXt 
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THIS  SUBROUTINE  HAS  T*0  FUNCTIONS.  ONE  IS  TO  TRAN51ATE 
PARTICLES  TO  NE*  POSIONS  ACCORDING  TO  vELUC I T Y . 0  I PfCT I  ON 
COSINE.   THE  UThER  IS  Til  OITECT  PARTICLES  -MICH  CooSS  CFLL 

boundaries  or  those  which  escape  from  the  medium  and 

to  la8el  them  using  check. <h   (route  check)  and  chfck.x7  ( 

Escape  check). respectively. 

This  subroutine  is  to  he  im^edoed  in  subroutine  branch. 
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?l"XCO» 
FOR  ALL  Z  GEO 
HEGIN 

XCOt'OI 


.9999  00 


401 

a  >2 
403 
A<)4 
«05 
406 

4  <J  7 
4Q8 
409 
«10 
411 
412 
•U13 
414 
4  15 
U\f> 

"17 

h  19 
42o 
421 
422 
4?3 
424 

425 
426 
427 
4?8 
429 
430 
431 
432 
433 
434 
435 
436 
437 
438 
439 
440 
441 
442 
443 
444 
445 
446 
447 
448 
449 
450 
451 
452 
453 
454 
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CNRi»Cnh*H     AGOES  T|)  RIGHT 
FUR  ALL  niiU(r.N4)«0  on 

CHf!CA,xrialJ    KLSCAPE 


M)H  »LL  7    LE  >  .()n01  00 
HCGIW 

CNHi«Crn-ll  »GOES    TO    I  EFT 

FOR    ALL     UiiHCN'O.f    00 

[iOI    *iLt»'TX« 


•«n 


•  *■> 


BOUNDARY  CROSSING  IN 
Nn  ROUTlNf,  t  N  «/:iL  VCn. 


-oiPFcTiriN  ano  Escaping, 


TIMf  AND  IT  ALSO 
ANO  LEFT  SHtUCS  TME 


BnUNU»flY  CROSSING  IN  Y-OlRFCTlOM.  ROUTING  INVDLVln. 
C»RF  H»S  RXE*  THAKF.N  Fow  A  DOUBLE  ROUTING  NOT  TO 
Uc.r.HR.  IF  A  PARTICLE  OMSSFS  A  nOUMOARY  DOmNMAKO 
ThEN  UP-Awn  AGAIN  AFTER  SCATTER  I NG. 1 1  HAS 
To  fll  RflUTrO  TvICE.  THIS  IS  A  LOSS  n? 

cuts  uft  tmf  route  lin<»hecaijse  right 

Sa*E  ROINTFR. 

ORTHibNOOCCNuX 
7i«Ytn» 

rnR  ALL  7.    6C'J  .<*99<t    or) 
*EuIN 

YCO|»OJ 

CNHl«Cf<i**«l  TGOES    t.PrtARO 

CHECK. XM»«CMFCK.X»*ll 
t  N(>  i     «i|P* 


rnR    ALL    Z   IE ■)    .0«')I    on 

-JE'ilN 

YC0i»U 

CNBl»CNH-*J  ir.UFS    DOWNmARO 

CHECK. X*I"CMFCK.<«-1 I 

ENO|       tO0*N* 


TOR    ALL    CNH<0    UR    CN*>9ll    00 
RE'ilN 

CHECK. Xn«l  I 

CHECK, XSI-1024I 
Lnim  SESCAPF    IN    Y-OIRECTION 


A«* 


•  9 


LNr>i    ««*TR»NSLATE 


50n 


«2» 


SlIIX*lt*I*lTtIllil*ttl*lXS*XX****lX<^»i****ttlXX*gXtll(tt*lttttXMIllflttltllItt 


•  55 

•  56 

457 


<i59 
460 
A61 
46? 

461 

•  64 

4*5 
"66 

'i  6/ 

169 
'.6V 
470 
4M 
47? 
47i 
474 
*75 
i76 
477 

47ft 

479 
410 
4ftl 
«•? 
«»3 
«8« 
4ftS 
4ft* 
4«7 

•  8ft 

499 
490 
«91 
492 
49J 
494 

i95 
'.9* 
i97 
i98 
199 
900 
•01 
902 
503 

904 
905 
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SUMHntlTlNF       H9»NCH(RINT       Thh)| 
HCGTN 

THIS    SijMROUTION    IS    TO    OFCluE       H*T    IMTERACTION    IS    TO    TA -F    PLACE. 
AFTER    THAT    THAN5L*TE    IS    CALLED.     THFn    RiVJTE    LINkTNG    TS    PFRrORMEO, 


PHF.AL  XY.XZ.X*| 

LABEL         Ll-*' 

CHECK. X3i«0> 
L  1  9  t  Fil«    4LL    CHECK.  »SirtHJ    Oil 

hCMN 

CHECK*  *5l«4i96| 

f)ST:iNf)» 

l>LT5l"VF.L*T» 

XYIbOHH 

l.'&Ht 


XYlaOHH-uLTSI 

l)GHI»(CHF.CK,X3inF.N.S.r  OP TH J  X 1 *SG- I 
xZi«SVu-<)LTS*n«Mj 
XN:«nGri*OHR»SV!)i 

FOP.    ALL    xf<n    ANO    Xw    Lty    0    Oil    «CPOSMNG 
JFGI  M 

CHECK. XSI-rfl  *2I 
l)LTS««n««l 
r  ki  <  1 1 


poonoary 


Em  '» 

FOR  ALL  X>  f.FO  0  A'JO  X?<0 
HFGIn    «!NTF.RACTim<t 

nLTS»«svn/nr,Hi 

CHFC*.XM««n96l 

E'lOl 


THOSE    AWE    Til    HA>/E     H'TEHACTION 


53i 

OR  XY«0  ANO  Xw>0  1)0 


53* 


THAmSLATEJ 


'.(.I 

ENil» 


TD    Lt"U 


rnH 


rn« 


n* 


SI 


52  >i 

ALL     CHECK. X7»l     00 

escape; 

ALL    CHECK, X3«l     Oil  *«ir,HT    ROOTTNd 

CHAlN(«crH,VNXT.l»TNB>J 
ALL  CHECK.  X3»-l  Of) 

CHAIN(LCTK.LNXT,1»TNR)I 
THOSE  AHE  FO*  "LSCAPE" 

CHECK. xai»oi 

CHECK ,X5l>40V6l 
FOP  ALL  CHECK. X6«4096  ANO  WT  >  ,  0  1  00  INFRACTION  TYPF 
BEMN 


5n6 
507 
50H 
509 
510 
511 
512 
513 
514 
515 
516 
517 
518 
519 
520 
5?1 
«>22 
523 
524 
5?5 
S26 
527 
52» 
5?9 
530 
531 
132 
^33 
534 
535 
536 

5^7 
53b 
539 
540 
S«l 

542 
543 

544 
545 
546 
547 
548 

549 
550 
551 
552 
553 
554 
555 
556 
557 
558 
559 
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XY  laAHC/Sl»MI 
RNl»RN.jX<  1  )  I 
f  OR     ALL     *N    |.£.)     Xy     f)H 

AriSHHM: 
►OR    ALL    WN>*Y    no 

CHAl"f(SCTH»SNXT.O,TNR)l 
Erful     *  I  MTL-*ACT  I»l(g 


«SCATTCRlnr, 


55  J 


SIH*RTTt(L 
"R 
EMOi   »*RR»NCH 

t»tIXA**»»*i:**X*«***«***>> 

SU^HnUTlNE  STriRL(f 

UCf.IH 
f  Tn  STORE  DATA  u 

pkL  .  ( 
PKL.l 
PKL.f 

pkL.I 

PKL.t 

PKL. I 

rnR  a 

PKL.C 
PKL.l 
PKC.l 

Endi  *sT(jPr 


»^IH,»»ir,Lr 
Nt  )  O  nn 


!NE."HrtANCM  C Al LEO" •"tSCK«". CHECK.  <7, 
TCK«"»CH£r.K.XA)l 

SO* 

INT    Tn  .)  s 
uiuuuMnnifj'innnumjiifminunMimin 

N    A    KAHTlCLf    Tn    PARTICLE    TA>Lt 
TNrtlXliar.^rti  XCKLL    NOHHF.M 

TMH  »XSl«*Cf»*1*»3«al        *XcrWRQlNATE 
TNH1X6I»YC0»IMH«I       tVCOOROlNATE 
TN«*I  )A5l»H'l#lA3A«l     IUIRECTION    rOSlNF 
TNH*  1  1*  Ji*S*'r*l  MIMifSIN    Or    AZlM.ANftLF 
TNrt^i  ]A*>«"CSF*  t*3»«H  *Cf1S    UF 
IL    CHECK. X",     4£  •.    a094    AND    WT 
TNK*2Utl"1»     »«!1FLTA    T. 
TNH*3]X1  I««TJ       X4EIGHT 
TNH*JJ*?i«Svni     «SURVTVAL    DISTANCE 

57  , 

limianiiituutiifUtuiuuttiuKtinmttiuuunauttiuiuitiiiiuntr 

SUBROUTINE  ANUEl 
SI«S(»»S 4«»t «*»««*»*««** t*X*X»«*****««t»ttXX»««t**tXI»»I««»tt«I*«l*II«*t|«lH««t 

DETERMINATION  ijf  SCATTFWIN«  ANILEs.  DIRECTION  COSINE  OREY* 
ThE  LAh  UF  THOMSON  SC  ATTtR  I  NG.  I .  E  .  PKOH(  MU>«3/8  I  (  1*Mi|lHII)  . 
A7IMUTHAL  ANGLt  IS  UNIFORM. 

HFfilN 
PREAL   XX.XY.<Z.X*»Tl.Ti».T3.THrTA.*Il 
CINT   I> 

«A  HANOQH  N U rt H fc «  iITH  THE  PKUR.  OENSlTY  r.TVfN  ABOVE 
XVlm.71  «*INITTAL  VALHF  UF  NF*TON«RAPm$ON I S  METHOD 

XMIbKNUX( 1 ) t 

i  hop  n»i»i»i  on   inEwton-baphson  iteration 

iEIjIN 

XZl«XY*XYJ 
XVi«(XY»xZ*.5-X«l)«l.l33333/(l**Z>l 

E-HI 

403 
kT«»XY1    ^SCATTERING  ANGL*.  (DIRECTION  rOSlNFJ) 
TMtTA|»T^nPI#RNnX(l)»      XAZIM.ANGLE 
XHl«S«RT(l-<<I«Ki)|        «SIN(KI) 

XXI«SQRT( l-.*U»*U)l        «SIN(ORIGlNAL  niRFCTION) 
XZIaXM*XXI 

calculation  uf  ne*  direction  cosine  and  azimuthal  angle  of  particles. 
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%i 
s 
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5 

s 

sj 
53 

Si 

S( 

53 

« i 

M 

«  I 

53 

SC 
•»' 
5' 
51 

S( 
O 
k* 

si 
si 
si 

51 

51 
51 
51 

S1 
51 
s1 
51 
*< 

5« 

o( 

6' 

6( 

40; 

404 

40! 

40< 


401 
40< 
40* 
410 
611 
412 
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Esni 


fmh   »LL   »*   'if'   o   no 

riCvilN 

Xri«HU««I-X/«CnS(THET*)l 
Tll«Si»*T<  l-KY»*r>l 

ran   »ll  ti  sea  o  on    «  nch  set  or  azIn. 
*egin 

T2i«XK*STX(T^ET*)/Tll 
TJ|a(Mu*Xr-kI)/(M*Tl)l 

Tiiacsn 

C$f  l"T?«»NF«T3«CSf» 
S*FiaTJ«*NF»T?«Tll 

EV">| 

r<|H    all    vu«l    ')*     »U««1    00 
mE-IM 

SNr i«si'<*(  rnrTA)i 
csfibcos(Tmfta>i 

MljlaXYI 
Y*laHNDX( 1  >J 

rn*  all   <*»«•<  un 

HUla-MOl 
t  NEi  ANGLES 


AIM 

41/4 


*2r 


«.IA 


III 

III 

I 
I 
I 
I 

s 
I 


jiaunTU»inmj«iuunuunuiuumnjimmmmt»unmsmi)U'. 

SUHPOllTINE         SCATTER! 
|l*atAfa«IIX»tAAIXIk»StS««ltiA«»i»SIZCk>tX«*>XII|lllltSlff|l|lll«llllll|l|l|lt 

*EGfN 

IhOmSon  SCATTTalMo   NCw  Dl»ll.CTIfJN  rnS.  A/H.  ANGLES*  TM  ANn  SURVIVAL 
OIStAmCE  ARE  SrLECTEO 


hint  twri 

HoFAL  TMETAI 

LA*EL  LSI 

SlH»RtU(LINE."SCATTE«INr,  C*LLFU«)I 
SIMKl»TTE(LlNE."SCATTEBtO  P ART  I CLEa" . SNXT > | 

lsi  rnn  all  sctw>o  do 

HEMN 

CNtfiaPKL.  tS.IXTlXII   ?CELL  NUMBER 

r>PTMI««.)0(CN-»)  I 

*CUlaCVT"L(t*KL.tvaTlX5)l 

YC0laCVTRL(PKL.[SNXT1X6)l 

ANULEI 

THI«PKL.(SN*T*^JXt  » 

nGMl»OENS.(UPTHlxi*SlMI 

FNRiaPKL.[SNXT*21X2l 

WTlaPKL.tSNXT*3)Xll 

$VOIa-LHCl-«NnX< i  ))| 

FOH  ALL  *T  GEQ  ."1  DO 

CHECK. ASia«192l       ttNAPLE  PElS  Im  "P.RANCH" 
RRANCH(SNXT) I 


A13 
M« 
AlS 
MA 
*lf 
All 
M» 
A20 
A2l 
*22 
A23 
*24 

425 

A?6 
A27 
A2d 
A29 

A3o 

631 
A32 
433 
A34 
415 

A3* 
A37 
A3! 
A3» 
A40 
441 
A«2 
443 
AM 
4*5 
A«* 
A47 
6*8 
A4» 
A50 
A51 
*52 
A53 
ASA 
AS5 
ASA 
A57 
ASS 
AS* 
440 

4*1 
4*2 
643 
AAA 
A45 
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•KL.[SNXT*2JXl  l-TMJ 

*TORE(SNXT)l 

4NXTiaPKL.»(SNXT]X4l 

SCTRl«SCTP-ll 

nO    TO  LSI 

ENOi 


t*0|   ISCATTTRlNr, 


♦  »•• 


Alt 


tiitiutitiitiuitiHUiuuiiniuiiiuititiutumiiitUfuiiuiuutuiiiiP 

SUHtOtlTINt    ROUTE<HNT    OUT    CNTR.PlNT    OUT    PNTR.CINT    »)J 

itit*it»i»fiiixii»ttiiiiiiititi*i*itiiiitiitiitititiisixiit?ttitii«iiittiiinit 

TMIS  SUBRuUTlMf  ROUTES  ""ARTICLES  Tl  THET«  nE  ISM*nRS  ,E T  THFR  RIOMT 
Uk  (.trT  ROUTE  CAN  ^E  SPEClrUlJ  8Y  SFTTlNu  «.l  OR  <«A3 . RESPECT  I VE 

LY.   pnT*  IS  HNXT  Ort  LNXT.  HHERE.AS  CNTR  IS  R*TR  0*  LCTR, 
QRUInAL  HfinE  «IT  IS  SAVED  I"*  nROMO  ANO  SHlfTEO  ONE  I*  SAVPO  IN 
ShFMI).  AFTE*  EACH  TrANSMISSIUN  Or  •ARTICLE.  TME  PARTICLE  TAnLE 

Entry  is  linked  to  vacancy  chain. 
arr.iN 

BOOLEAN         OHtaHj.SHfMOl 
CINT  LI 

•INT  PJ.THPi 

FOR    PJI-1    STEP    1    UNTIL    CNTK    00 
HE(ilN 

orqmdi »Mnuri 

SHFHI)I«ShIFTR(K.NOOE>» 
MOOEl'SHEMOl 
|.nOP  Ll*0*l>1  DO    tSENn  A  PARTICLE  TO  ITS  NEI6H10R. 
PKL.tVNXT*L]XI«RTR(K.TRUE»P>fL.rPNTR»L-l  JK)I 
VNXTl«PKL.f VNXT1X4I 
VCTRI»VCTR«1 I 

HUOEI«JR(iNDI    inRIfilNAL  *ODE 
PKL.IPNTR+21K1 laO|   iMETftHlNG  FACTO*  SET  TO  ZERO 
«TU  INUICATE  THAT  A  ^ARTICLE  MAS 
tdEEN  ROUTEn. 
TNP|«PNTRI     tSWlTCM  FROM  (R  OR  L^POINTER  LIN* 

fTO    VACANCY    LluK. 
PNTRI«PKL.CPNTR]X«U 
PKL.tTHP-lJX«l«VNXTI 
VNXTI-T'^-ll 

VCTRI«VCTR*1  I    *NEM  VACANCY  BECAUSE  A  PARTICLE 
<HAS  REEN  ROUTED  TO  ITS  NEIGHROR 
LNOl     AAlL  particles   TRANSEEREO 

*•• 
rNTRlaOl 
EnOI     II    RHilTE    IS 

AS 'i 

itiiiutuittnttiutuHuuitttututitiitutuiittuuiittitttiutiiiuiiitti 

subroutine  ta*leprnt<cint  m» 

fItIISSt»tI«(tI«ItSKfXlXKllX3*SSIllttIttttSI«S»IlllIIIItISt«tI<ItISsIIttIISSttgf 

BEGIN 
g         TO  PRINT  OUT  THE  PARTICLE  TABLE 
PREAL      Al 

SI  HURT  TEC  LINE  tPAtaE  3  '"PARTICLE  TARLE-- NO.". 1 1 ) I 
SIM*ritE(LINE."CElL  NUMBER" .PKL.t II3X3)» 


'00 
'01 

'05 
'03 
'01 
'0! 
'0< 
'01 
'01 

'01 

'11 

■t 
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Al«CVT*K'KL.(  11*1  ]XS)I 

S  I  H*aiH(  LINE.  "01  SECTION  COS",*)! 

SlN«RTT£(LlNEtPA«iM»"POlNTrP.  1"»PKL.[II)X«)I 

»lM*RTTi( LINE. "POINTER  2".ML.tlIM]X«)l 

AiaCVTHMPKl.CII  JX*)I 

S!MMTTl(LlNECPAGE]»"x*COOROlNATE".A)l 

Ai«CVTBl(Pta.UI)X*>l 

$IMndTTE(ClNE."Y"CnORnlH*Tf:".»)l 

Al«CVm<PKL, [11*11X3)1 
SINWBiTE(LINE[PAGE)»"SlN    PfI".A)l 

■•■CVT*L<PKl.[II*llX*)» 
SIMrfRTTECLlNE."CO*  Pri».*)i 

SIM*R!Tl<LlNlfPAUF]."nELTA  T".P«L.UI*21Xnl 
SlMHRITE(LlNE."ENERGY"»PKL.tTI*2lX2)J 
SIMwritLC LINE (PAGE! ."SURVIVAL  01  ST  .  ".ML.  f  1 1*1)  X?)  I 
SIMKRTTE  (LINE.  "r«E  I  r,HT".P«L.tTI*3  1X1)1 
ENOI   t  PRINT  PARTICLE  TARLF 


M« 


I 

tun 
tltJI 
IIXS* 

I 

s 

inn 

in*i 

tint 


utititwuunttitiuiinntiuuniiutunntmtuMUtinunruittut 
iiiiutummtiummiaummmtujumunJHiitJiiimiiimtu* 
utiutixtiiiuuumuinitiutinntiKttiitiiiniiiitniittatttutKt 
t  i 

BEGINNING  nr    MAIN  PROGRAM 

xxiikxkxixixxkitkxxxsxxiikxkxxkckstxkkixxxkxiskkkxkkxxtkktktkkkkkkttkkkkktt 
xi«xkk«kikkXkk«tkkkxiiikxixfxikfxkxiikktiiiittxkkkkitktttiktkkktikittktsit« 
xxfiixiixkikxxxttxisiiiiiixistiiiiiititxiititixitiiiiiititiiikkkskiitkikktf 

PKLl»GETPER(kO)» 
DfNS«»8ETPFB<S)J 
ENERGYi»atTPH<«)» 

check i««CTPC»( i)i 


DATALllAOJ 
SqMI»ABC*SCCJ 

COUNTERS  CLEARED 
AcTRI'OI 
EcTRi"0) 
VtfXTIaOl 
VcTRI'Ol 

LOOP  TNBIaO.«»7*  00       t  20  PARTICLES  IN  EA*;h  Pr. 
CHAlNf  VCTR.VNXT.O.TNBW   XvACANCY  IS  LINKEO. 


LOOP  TIM£|«0»1»3  00 
BEGIN 

RCTRlaOl 
LCTRlaO) 

«NXTl«l  » 
LNXTlall 


ENERGvBALANCEl 
PARTICL^ASSIGNI 


<  BEGINNING  OF  TIME  CvCLp 
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LOOP    TNMI«0 
HEUtH 

Ml     HC 

CHUI-P 

«CUi*C 
VCUI«C 
T1»»I«T 
MiilaCV 
<NF l»C 
rsn»C 

TN«I«T 
THI«HK 
FNHI-P 
TNB«"T 
nTI"PK 
SVUI"P 

nPTHi* 
ro«  AL 

c 

BBANCH 


fit  ft,    DO 


MEfilHHING    Of    T»»Lt    SC-NNINQ 


THE 


AO    OUT    FKOM 

Kt.CTNmxii 

VTRL(PKL.tTN*m>» 
VTRLCPKL.(THi)XA)l 
NI*1J 

TRl(PHL.lTH<«m>l 

¥TRL<PKl.[TNR1X3)l 

VTrfL(PKl.[TNH)XA)l 

L.CTNimi i 
KL. t  TNrf]X2l 
NB*1  I 

C.CTNHlXl | 
KL.tTHH JX2| 
MU>>(  C.iR  )  I 
L    *T    <jF  i    .ni    OQ 
HECK. A5i»H192l 
(TN«)I 


PARTICLE   TAULt   til 
<  CELL  NUMBED 
«  X-CO  ORDlMATr 

t    Y-COijanlNATE 


IUIRECTION  rOSTNE 
«SIN  P^I 

«COS  PHI 


iRf SinUAl 
IF.NEffiY 


TIME 


S«EKiMTlNr,  rACTOP 
XSimVIVAL  niSTANrE 


UNAHLE  PEt*  U>  BRANCH 


*TUHE(T*n)i 
CHECK, X7i«0» 
ENOi     I  TABLE  SCAHNlNr, 

SCATTFRI 

ROUTEfRCTR.RHXT.l)l 
ROuTEfLCTH.LNXTtRJ)! 
TALLVi 
END)   STIHE  CYCLE 


tND.  *  OF  HAIN 


?r» 


1 


102 

R04 
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